- 
                Notifications
    You must be signed in to change notification settings 
- Fork 21
Nexthop type group (aka ECMP) #297
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| Caution Review failedThe pull request is closed. 📝 WalkthroughWalkthroughAdds ECMP nexthop group support end-to-end. Public API: new enum GR_NH_T_GROUP and group info/member structs. Control plane: group nexthop type, lifecycle, import/export, and member bookkeeping. Datapath: new ip_loadbalance and ip6_loadbalance nodes that select group members via mbuf RSS hash; icmp/icmp6 local send adapted to unwrap groups and set mbuf RSS. CLI: nexthop group add/del commands and IDENT option for ping/traceroute. FRR grout integration handles group add/delete requests. Tests and build files updated to exercise and compile the new features. Pre-merge checks❌ Failed checks (1 warning)
 ✅ Passed checks (2 passed)
 📜 Recent review detailsConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro ⛔ Files ignored due to path filters (1)
 📒 Files selected for processing (17)
 Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment  | 
47c5af8    to
    3e638de      
    Compare
  
    | @coderabbitai review | 
| ✅ Actions performedReview triggered. 
 | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 9
🔭 Outside diff range comments (2)
modules/ip6/control/route.c (1)
291-301: Potential NULL deref + wrong type source (LPM vs exact) + UAF on flag clear
- nh comes from rib6_lookup (LPM). It can be NULL (-> crash on nh->type).
- Even if non-NULL, using LPM to fetch type for an exact route delete is incorrect; type may not correspond to the exact rn being removed, leading to spurious EINVAL.
- Clearing nh->flags after rib6_delete risks use-after-free when ref_count was 1 (delete decrefs and may free).
Fix by looking up the exact rn, clear gateway flag before deletion, then delete using the exact nh type.
Apply this diff:
- nh = rib6_lookup(req->vrf_id, GR_IFACE_ID_UNDEF, &req->dest.ip); - ret = rib6_delete( - req->vrf_id, GR_IFACE_ID_UNDEF, &req->dest.ip, req->dest.prefixlen, nh->type - ); - if (ret == -ENOENT && req->missing_ok) - ret = 0; - - if (nh && nh->ref_count == 1) - nh->flags &= ~GR_NH_F_GATEWAY; + nh = rib6_lookup_exact( + req->vrf_id, GR_IFACE_ID_UNDEF, &req->dest.ip, req->dest.prefixlen + ); + if (nh == NULL) { + ret = -ENOENT; + if (req->missing_ok) + ret = 0; + } else { + /* Clear gateway flag before deletion if this was the last holder */ + if (nh->ref_count == 1) + nh->flags &= ~GR_NH_F_GATEWAY; + ret = rib6_delete( + req->vrf_id, GR_IFACE_ID_UNDEF, &req->dest.ip, req->dest.prefixlen, nh->type + ); + if (ret == -ENOENT && req->missing_ok) + ret = 0; + }modules/ip/control/route.c (1)
267-275: Potential NULL deref + wrong type source (LPM vs exact) + UAF on flag clear
- nh comes from rib4_lookup (LPM). It can be NULL (-> crash on nh->type).
- Using LPM to fetch type for an exact delete can mismatch the route's actual type, causing EINVAL spuriously.
- Clearing nh->flags after rib4_delete risks use-after-free when ref_count was 1 (delete decrefs and may free).
Fix by using exact lookup to get the nh to delete, clear the gateway flag before deletion if it's the last holder, then delete with the exact nh type.
Apply this diff:
- nh = rib4_lookup(req->vrf_id, req->dest.ip); - ret = rib4_delete(req->vrf_id, req->dest.ip, req->dest.prefixlen, nh->type); - if (ret == -ENOENT && req->missing_ok) - ret = 0; - - if (nh && nh->ref_count == 1) - nh->flags &= ~GR_NH_F_GATEWAY; + nh = rib4_lookup_exact(req->vrf_id, req->dest.ip, req->dest.prefixlen); + if (nh == NULL) { + ret = -ENOENT; + if (req->missing_ok) + ret = 0; + } else { + /* Clear gateway flag before deletion if this was the last holder */ + if (nh->ref_count == 1) + nh->flags &= ~GR_NH_F_GATEWAY; + ret = rib4_delete(req->vrf_id, req->dest.ip, req->dest.prefixlen, nh->type); + if (ret == -ENOENT && req->missing_ok) + ret = 0; + }
🧹 Nitpick comments (10)
subprojects/packagefiles/frr/meson-add-dependency-definition.patch (1)
106-106: Avoid hidden drift: make multipath a meson optionConsider a meson option (e.g., -Dfrr_multipath=64) to keep this in sync with GR_NH_GROUP_MAX and ease overrides in CI.
Example snippet within this meson file (conceptual):
- Define option in project options: option('frr_multipath', type: 'integer', value: 64)
- Replace 64 with get_option('frr_multipath')
modules/ip/cli/route.c (1)
82-98: Also render GROUP nexthops in NEXT_HOP columnBlackhole rendering looks good. For completeness and consistency with new types, add a GROUP branch (e.g., show “group” or “group()”) instead of falling through to AF-based formatting.
- if (route->nh.type == GR_NH_T_BLACKHOLE) - scols_line_sprintf(line, 2, "blackhole"); - else - switch (route->nh.af) { + if (route->nh.type == GR_NH_T_BLACKHOLE) { + scols_line_sprintf(line, 2, "blackhole"); + } else if (route->nh.type == GR_NH_T_GROUP) { + if (route->nh.nh_id != GR_NH_ID_UNSET) + scols_line_sprintf(line, 2, "group(%u)", route->nh.nh_id); + else + scols_line_sprintf(line, 2, "group"); + } else { + switch (route->nh.af) { case GR_AF_UNSPEC: if (iface_from_id(c, route->nh.iface_id, &iface) < 0) scols_line_sprintf(line, 2, "%u", route->nh.iface_id); else scols_line_sprintf(line, 2, "%s", iface.name); break; case GR_AF_IP4: scols_line_sprintf(line, 2, IP4_F, &route->nh.ipv4); break; case GR_AF_IP6: scols_line_sprintf(line, 2, IP6_F, &route->nh.ipv6); break; } + }modules/ip6/cli/route.c (1)
82-98: Mirror GROUP nexthop rendering for IPv6Same suggestion as IPv4: render GROUP explicitly (e.g., “group” or “group()”) so users don’t see misleading AF-based info for groups.
- if (route->nh.type == GR_NH_T_BLACKHOLE) - scols_line_sprintf(line, 2, "blackhole"); - else - switch (route->nh.af) { + if (route->nh.type == GR_NH_T_BLACKHOLE) { + scols_line_sprintf(line, 2, "blackhole"); + } else if (route->nh.type == GR_NH_T_GROUP) { + if (route->nh.nh_id != GR_NH_ID_UNSET) + scols_line_sprintf(line, 2, "group(%u)", route->nh.nh_id); + else + scols_line_sprintf(line, 2, "group"); + } else { + switch (route->nh.af) { case GR_AF_UNSPEC: if (iface_from_id(c, route->nh.iface_id, &iface) < 0) scols_line_sprintf(line, 2, "%u", route->nh.iface_id); else scols_line_sprintf(line, 2, "%s", iface.name); break; case GR_AF_IP4: scols_line_sprintf(line, 2, IP4_F, &route->nh.ipv4); break; case GR_AF_IP6: scols_line_sprintf(line, 2, IP6_F, &route->nh.ipv6); break; } + }modules/ip/datapath/ip_input.c (1)
32-41: Guard against out-of-range nexthop types indexing nh_type_edgesnh_type_edges has a fixed size (256). If gr_nh_type_t ever exceeds this bound, indexing invokes UB. Add a compile-time guard where the enum is defined or here as a sanity check.
Example (outside this hunk, near gr_nexthop.h or alongside nh_type_edges):
_Static_assert(GR_NH_T_MAX < 256, "nh_type_edges[] size too small for nexthop types");Also applies to: 159-162
modules/ip6/datapath/ip6_loadbalance.c (1)
31-46: Defensive checks for group selectionTwo edge cases worth guarding:
- d->nh should be non-NULL and type GROUP when this node runs.
- Selected member nh->nh_group[idx] could be NULL (e.g., transient membership).
A tiny guard prevents crashes and routes such cases to NO_NEXTHOP.
@@ - nh = d->nh; + nh = d->nh; edge = OUTPUT; - if (nh->nh_grp_count == 0) { + if (!nh || nh->nh_grp_count == 0) { edge = NO_NEXTHOP; goto next; } - d->nh = nh->nh_group[mbuf->hash.rss % nh->nh_grp_count]; + { + const struct nexthop *m = nh->nh_group[mbuf->hash.rss % nh->nh_grp_count]; + if (m == NULL) { + edge = NO_NEXTHOP; + goto next; + } + d->nh = m; + }modules/ip/datapath/ip_loadbalance.c (1)
31-46: Harden against NULL group members and unexpected nhSame as IPv6 path: guard d->nh and the selected member pointer to avoid crashes in transient or misconfigured scenarios.
@@ - nh = d->nh; + nh = d->nh; edge = OUTPUT; - if (nh->nh_grp_count == 0) { + if (!nh || nh->nh_grp_count == 0) { edge = NO_NEXTHOP; goto next; } - d->nh = nh->nh_group[mbuf->hash.rss % nh->nh_grp_count]; + { + const struct nexthop *m = nh->nh_group[mbuf->hash.rss % nh->nh_grp_count]; + if (m == NULL) { + edge = NO_NEXTHOP; + goto next; + } + d->nh = m; + }modules/infra/control/nexthop.c (1)
672-676: Nit: resetting type to L3 in free() is redundantnexthop_decref() memset()s the object right after calling ops->free(), so setting nh->type to L3 is unnecessary. Safe to remove for clarity.
static void nh_type_group_free(struct nexthop *nh) { - nh->type = GR_NH_T_L3; memset(&nh->nh_group, 0, sizeof(nh->nh_group)); }frr/rt_grout.c (1)
863-889: Ensure AF is UNSPEC for BLACKHOLE nexthopsYou pre-set AF from the NHE AFI earlier. For BLACKHOLE, force AF to UNSPEC to avoid leaking an unrelated AF down the stack.
case NEXTHOP_TYPE_BLACKHOLE: - gr_nh->type = GR_NH_T_BLACKHOLE; + gr_nh->type = GR_NH_T_BLACKHOLE; + gr_nh->af = GR_AF_UNSPEC; break;modules/infra/api/gr_nexthop.h (1)
77-78: Optional: static assert GR_NH_GROUP_MAX fits nh_grp_count type
nh_grp_countisuint8_t. Consider a compile-time check to prevent future regressions ifGR_NH_GROUP_MAXchanges.Example:
_Static_assert(GR_NH_GROUP_MAX <= UINT8_MAX, "nh_grp_count is uint8_t");modules/infra/cli/nexthop.c (1)
389-395: Fix help text for CLEAR commandThe CLEAR variant empties a group, not appends. Update the description for clarity.
Apply this diff:
- "Append a nexthop to a group.", + "Clear all nexthops from a group.",
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
⛔ Files ignored due to path filters (1)
- docs/graph.svgis excluded by- !**/*.svg
📒 Files selected for processing (18)
- frr/rt_grout.c(5 hunks)
- modules/infra/api/gr_nexthop.h(5 hunks)
- modules/infra/api/nexthop.c(3 hunks)
- modules/infra/cli/nexthop.c(7 hunks)
- modules/infra/control/nexthop.c(3 hunks)
- modules/ip/cli/route.c(1 hunks)
- modules/ip/control/route.c(1 hunks)
- modules/ip/datapath/icmp_local_send.c(1 hunks)
- modules/ip/datapath/ip_input.c(2 hunks)
- modules/ip/datapath/ip_loadbalance.c(1 hunks)
- modules/ip/datapath/meson.build(1 hunks)
- modules/ip6/cli/route.c(1 hunks)
- modules/ip6/control/route.c(1 hunks)
- modules/ip6/datapath/icmp6_local_send.c(1 hunks)
- modules/ip6/datapath/ip6_input.c(2 hunks)
- modules/ip6/datapath/ip6_loadbalance.c(1 hunks)
- modules/ip6/datapath/meson.build(1 hunks)
- subprojects/packagefiles/frr/meson-add-dependency-definition.patch(1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (7)
modules/ip/cli/route.c (3)
cli/table.c (1)
scols_line_sprintf(10-23)modules/infra/cli/iface.c (1)
iface_from_id(130-143)modules/infra/control/gr_iface.h (1)
iface(15-21)
modules/ip/datapath/ip_loadbalance.c (6)
modules/infra/datapath/gr_mbuf.h (1)
gr_mbuf_is_traced(52-54)modules/infra/datapath/trace.c (1)
gr_mbuf_trace_add(461-488)modules/infra/control/gr_graph.h (1)
rte_node_enqueue_x1(22-25)modules/ip6/datapath/ip6_loadbalance.c (1)
loadbalance_register(51-54)modules/ip/datapath/ip_input.c (1)
ip_input_register_nexthop_type(34-41)modules/ip/datapath/ip_output.c (1)
ip_output_register_nexthop_type(45-52)
modules/ip6/datapath/ip6_loadbalance.c (6)
modules/infra/datapath/gr_mbuf.h (1)
gr_mbuf_is_traced(52-54)modules/infra/datapath/trace.c (1)
gr_mbuf_trace_add(461-488)modules/infra/control/gr_graph.h (1)
rte_node_enqueue_x1(22-25)modules/ip/datapath/ip_loadbalance.c (1)
loadbalance_register(51-54)modules/ip6/datapath/ip6_input.c (1)
ip6_input_register_nexthop_type(35-42)modules/ip6/datapath/ip6_output.c (1)
ip6_output_register_nexthop_type(41-48)
modules/infra/control/nexthop.c (5)
main/api.c (1)
init(493-495)modules/ip/datapath/fib4.c (1)
init(167-169)modules/ip6/datapath/fib6.c (1)
init(181-183)main/event.c (1)
gr_event_register_serializer(34-54)main/module.c (1)
gr_register_module(55-67)
frr/rt_grout.c (3)
modules/infra/api/nexthop.c (2)
nh_add(92-165)
nh_del(173-195)modules/infra/cli/nexthop.c (2)
nh_add(61-106)
nh_del(108-121)frr/zebra_dplane_grout.c (1)
grout_client_send_recv(179-216)
modules/infra/api/nexthop.c (5)
modules/infra/control/gr_iface.h (1)
iface(15-21)modules/infra/cli/nexthop.c (3)
nh_add(61-106)
nh_group_set(262-285)
nh_group_show(123-165)main/gr_module.h (1)
api_out(16-19)modules/infra/control/nexthop.c (2)
nexthop_lookup_by_id(481-488)
nexthop_incref(535-537)modules/ip/api/dnat44.c (1)
_init(176-181)
modules/infra/cli/nexthop.c (5)
cli/gr_cli.h (1)
arg_u32(70-76)cli/ecoli.c (2)
arg_str(85-95)
with_help(15-24)modules/infra/api/nexthop.c (3)
nh_group_show(299-319)
nh_group_set(249-291)
nh_list(220-241)api/gr_api_client_impl.h (1)
gr_api_client_send_recv(67-129)api/gr_net_types.h (1)
gr_af_name(31-41)
🔇 Additional comments (20)
modules/ip/datapath/meson.build (2)
19-19: ip_loadbalance.c added to build: LGTMBuild integration looks correct.
19-19: ip_loadbalance node registration verifiedThe file exists and correctly registers its datapath node and drop handler in modules/ip/datapath/ip_loadbalance.c:
- Line 71: GR_NODE_REGISTER(info_loadbalance)
- Line 72: GR_DROP_REGISTER(ip_lb_no_nexthop)
subprojects/packagefiles/frr/meson-add-dependency-definition.patch (1)
106-106: Multipath=64 support confirmed
FRR’s./configure --enable-multipath=64is supported in 10.x (up to 999) and zebra enforces the compile-time ECMP limit of 64. No changes needed.modules/ip6/datapath/meson.build (2)
13-13: ip6_loadbalance.c added to build: LGTMConsistent with IPv6 GROUP nexthop support.
13-13: ip6_loadbalance.c: file and node registration confirmedmodules/ip6/datapath/ip6_loadbalance.c exists and registers:
- GR_NODE_REGISTER(info_loadbalance)
- GR_DROP_REGISTER(ip6_lb_no_nexthop)
No further changes needed.
modules/ip6/datapath/ip6_input.c (1)
154-154: BLACKHOLE path wiring looks goodRegistering GR_NH_T_BLACKHOLE to ip6_blackhole and adding the drop node closes the loop for IPv6 blackholes. No further action from my side.
Also applies to: 189-189
modules/ip/datapath/ip_input.c (2)
159-162: BLACKHOLE nexthop wiring looks correctRegistering GR_NH_T_BLACKHOLE to "ip_blackhole" integrates cleanly with the nh_type_edges lookup and the drop path.
190-195: Drop-node registration for ip_blackhole is consistentRegistering the ip_blackhole drop path complements the registration above and ensures packets are dropped properly.
modules/ip6/datapath/ip6_loadbalance.c (1)
51-54: Registration matches IPv4 and keeps graph consistentMapping GROUP to "ip6_forward" on input and to "ip6_loadbalance" on output mirrors IPv4 and fits the pipeline.
modules/ip/datapath/ip_loadbalance.c (1)
51-54: Input/output nexthop-type wiring is appropriateRegistration mirrors IPv6 and maintains the intended path to the load-balancer node.
modules/infra/control/nexthop.c (3)
309-323: Allowing ops registration for BLACKHOLE and GROUP is correctSwitch extension is fine; invalidations are properly ABORTed for duplicates/missing callbacks.
331-346: Type whitelisting includes BLACKHOLE and GROUPCreation guard now permits the new types; good gatekeeping before allocation.
682-687: Registering GROUP type ops at init is goodEnsures equality/free hooks are available before use.
frr/rt_grout.c (2)
193-200: BLACKHOLE nexthop translation to FRR is correctMapping to NEXTHOP_TYPE_BLACKHOLE with bh_type=BLACKHOLE_NULL and AF_UNSPEC is aligned with FRR expectations. Weight=1 default is reasonable.
823-825: Delegation to grout_add_del_nexthop_group is the right splitEarly group handling keeps single-NH flow untouched. Good separation of concerns.
modules/infra/api/gr_nexthop.h (1)
158-162: String names for new types: LGTMMappings for BLACKHOLE and GROUP are consistent and user-friendly.
modules/infra/cli/nexthop.c (1)
62-80: nh_add control flow for blackhole/group: LGTMEarly exit to the send-path for BLACKHOLE and GROUP types is clean and avoids unnecessary AF/MAC parsing.
Also applies to: 101-106
modules/infra/api/nexthop.c (3)
42-46: nh_add type dispatch and validation: LGTM (clear separation and errno semantics)Clean split between BLACKHOLE/GROUP/L3 paths; validation helpers return errno-style codes for early API-out; later device ops return negatives and are normalized at the end. This flow is consistent.
Also applies to: 87-91, 99-116
186-189: Double-check ops->del for BLACKHOLE
nh_delallows BLACKHOLE types, but unconditionally callsops->del(nh)usingnh->af. For non-L3 (e.g., BLACKHOLE),afmay beGR_AF_UNSPEC. Confirm thatnexthop_af_ops_get(GR_AF_UNSPEC)is valid, or guard the call for non-L3 types.Would you like me to scan for the
nexthop_af_ops_getimplementation and its callers to verify it gracefully handlesGR_AF_UNSPEC?
299-319: nh_group_show response allocation/size: LGTMFixed-size response with bounded array; count set appropriately; server-side is correct.
        
          
                modules/infra/cli/nexthop.c
              
                Outdated
          
        
      | static cmd_status_t nh_group_set(const struct gr_api_client *c, const struct ec_pnode *p) { | ||
| const struct ec_pnode *n = ec_pnode_find(p, "NH_ID"); | ||
| const struct ec_strvec *v = ec_pnode_get_strvec(n); | ||
| struct gr_nh_group_set_req req = {.nh_grp_count = 0}; | ||
|  | ||
| if (arg_u32(p, "ID", &req.group_id) < 0) | ||
| return CMD_ERROR; | ||
|  | ||
| if (n) { | ||
| if (v == NULL) | ||
| return CMD_ERROR; | ||
|  | ||
| req.nh_grp_count = ec_strvec_len(v); | ||
| for (size_t i = 0; i < req.nh_grp_count; i++) { | ||
| const char *str = ec_strvec_val(v, i); | ||
| req.nh_ids[i] = atoi(str); | ||
| } | ||
| } | ||
|  | ||
| if (gr_api_client_send_recv(c, GR_NH_GROUP_SET, sizeof(req), &req, NULL) < 0) | ||
| return CMD_ERROR; | ||
|  | ||
| return CMD_SUCCESS; | ||
| } | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Avoid atoi overflow; parse NH_IDs safely and bound-check nh_grp_count
atoi returns int and silently overflows above INT_MAX. You accept NH_ID up to UINT32_MAX-1 in the grammar, so values in (INT_MAX, UINT32_MAX) will be mis-parsed. Also, defensively ensure nh_grp_count <= GR_NH_GROUP_MAX even if the parser already limits it.
Apply this diff:
   if (n) {
     if (v == NULL)
       return CMD_ERROR;
-    req.nh_grp_count = ec_strvec_len(v);
+    req.nh_grp_count = ec_strvec_len(v);
+    if (req.nh_grp_count > GR_NH_GROUP_MAX)
+      return CMD_ERROR;
     for (size_t i = 0; i < req.nh_grp_count; i++) {
       const char *str = ec_strvec_val(v, i);
-      req.nh_ids[i] = atoi(str);
+      char *end = NULL;
+      errno = 0;
+      unsigned long tmp = strtoul(str, &end, 10);
+      if (errno != 0 || end == str || *end != '\0' || tmp > UINT32_MAX - 1 || tmp == 0)
+        return CMD_ERROR;
+      req.nh_ids[i] = (uint32_t)tmp;
     }
   }📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| static cmd_status_t nh_group_set(const struct gr_api_client *c, const struct ec_pnode *p) { | |
| const struct ec_pnode *n = ec_pnode_find(p, "NH_ID"); | |
| const struct ec_strvec *v = ec_pnode_get_strvec(n); | |
| struct gr_nh_group_set_req req = {.nh_grp_count = 0}; | |
| if (arg_u32(p, "ID", &req.group_id) < 0) | |
| return CMD_ERROR; | |
| if (n) { | |
| if (v == NULL) | |
| return CMD_ERROR; | |
| req.nh_grp_count = ec_strvec_len(v); | |
| for (size_t i = 0; i < req.nh_grp_count; i++) { | |
| const char *str = ec_strvec_val(v, i); | |
| req.nh_ids[i] = atoi(str); | |
| } | |
| } | |
| if (gr_api_client_send_recv(c, GR_NH_GROUP_SET, sizeof(req), &req, NULL) < 0) | |
| return CMD_ERROR; | |
| return CMD_SUCCESS; | |
| } | |
| static cmd_status_t nh_group_set(const struct gr_api_client *c, const struct ec_pnode *p) { | |
| const struct ec_pnode *n = ec_pnode_find(p, "NH_ID"); | |
| const struct ec_strvec *v = ec_pnode_get_strvec(n); | |
| struct gr_nh_group_set_req req = {.nh_grp_count = 0}; | |
| if (arg_u32(p, "ID", &req.group_id) < 0) | |
| return CMD_ERROR; | |
| if (n) { | |
| if (v == NULL) | |
| return CMD_ERROR; | |
| req.nh_grp_count = ec_strvec_len(v); | |
| if (req.nh_grp_count > GR_NH_GROUP_MAX) | |
| return CMD_ERROR; | |
| for (size_t i = 0; i < req.nh_grp_count; i++) { | |
| const char *str = ec_strvec_val(v, i); | |
| char *end = NULL; | |
| errno = 0; | |
| unsigned long tmp = strtoul(str, &end, 10); | |
| if (errno != 0 || end == str || *end != '\0' || tmp > UINT32_MAX - 1 || tmp == 0) | |
| return CMD_ERROR; | |
| req.nh_ids[i] = (uint32_t)tmp; | |
| } | |
| } | |
| if (gr_api_client_send_recv(c, GR_NH_GROUP_SET, sizeof(req), &req, NULL) < 0) | |
| return CMD_ERROR; | |
| return CMD_SUCCESS; | |
| } | 
3e638de    to
    e95840a      
    Compare
  
    | @coderabbitai review | 
| ✅ Actions performedReview triggered. 
 | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
modules/infra/control/nexthop.c (1)
662-674: Group equality now checks both count and members — resolves prior concernComparing nh_grp_count first and iterating over the active entries fixes the previously-flagged equality issue where differing sizes could be considered equal.
🧹 Nitpick comments (1)
modules/infra/control/nexthop.c (1)
676-679: Avoid mutating nh->type in free; also clear nh_grp_count for claritySetting nh->type to L3 inside the free callback is redundant (the caller memset’s the whole struct right after). Clearing nh_grp_count makes intent explicit if this callback is ever reused outside the current decref path.
Apply this diff:
static void nh_type_group_free(struct nexthop *nh) { - nh->type = GR_NH_T_L3; - memset(&nh->nh_group, 0, sizeof(nh->nh_group)); + nh->nh_grp_count = 0; + memset(&nh->nh_group, 0, sizeof(nh->nh_group)); }
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
⛔ Files ignored due to path filters (1)
- docs/graph.svgis excluded by- !**/*.svg
📒 Files selected for processing (12)
- frr/rt_grout.c(5 hunks)
- modules/infra/api/gr_nexthop.h(5 hunks)
- modules/infra/api/nexthop.c(3 hunks)
- modules/infra/cli/nexthop.c(7 hunks)
- modules/infra/control/nexthop.c(3 hunks)
- modules/ip/datapath/icmp_local_send.c(1 hunks)
- modules/ip/datapath/ip_loadbalance.c(1 hunks)
- modules/ip/datapath/meson.build(1 hunks)
- modules/ip6/datapath/icmp6_local_send.c(1 hunks)
- modules/ip6/datapath/ip6_loadbalance.c(1 hunks)
- modules/ip6/datapath/meson.build(1 hunks)
- subprojects/packagefiles/frr/meson-add-dependency-definition.patch(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (11)
- modules/ip/datapath/meson.build
- modules/ip/datapath/icmp_local_send.c
- modules/ip6/datapath/meson.build
- modules/ip/datapath/ip_loadbalance.c
- subprojects/packagefiles/frr/meson-add-dependency-definition.patch
- modules/ip6/datapath/icmp6_local_send.c
- frr/rt_grout.c
- modules/ip6/datapath/ip6_loadbalance.c
- modules/infra/api/gr_nexthop.h
- modules/infra/cli/nexthop.c
- modules/infra/api/nexthop.c
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: build-and-tests (clang-18, none, debugoptimized, ubuntu-24.04)
- GitHub Check: build-and-tests (clang-15, none, debugoptimized, ubuntu-22.04)
- GitHub Check: build-and-tests (gcc-14, address, debug, ubuntu-24.04)
- GitHub Check: build-and-tests (gcc-13, none, debugoptimized, ubuntu-24.04, -Dfrr=enabled)
- GitHub Check: build-and-tests (clang-16, none, debugoptimized, ubuntu-24.04)
🔇 Additional comments (3)
modules/infra/control/nexthop.c (3)
315-317: Extending type-ops registry to BLACKHOLE/GROUP looks correct; confirm BLACKHOLE needs no opsThe switch now recognizes both types. Since there’s no BLACKHOLE ops registered in init, this implies no type-specific equality/free is needed for BLACKHOLE. Please confirm that assumption holds across call sites (e.g., no reliance on ops->free/equal for BLACKHOLE).
341-343: Allowing BLACKHOLE/GROUP in nexthop_new is correctCreation path now accepts both types and preserves existing validation semantics.
690-691: Registering GROUP type ops at init — good wiringEnsures GROUP gets the equal/free hooks early; integrates cleanly with nexthop_equal and decref.
3a36b0d    to
    4250823      
    Compare
  
    4250823    to
    1fc6a93      
    Compare
  
    9c3570d    to
    7a544da      
    Compare
  
    | rebased on main, will clean & test that PR this week before switching from draft to "ready to review". | 
a52c387    to
    cca8e2e      
    Compare
  
    | ✅ Actions performedReview triggered. 
 | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 6
🧹 Nitpick comments (12)
modules/ip6/cli/icmp6.c (2)
144-147: Clamp default IDENT to 1..UINT16_MAX to match CLI contract.CLI restricts IDENT to [1, UINT16_MAX], but default uses raw random() which can be 0 and over 16 bits before truncation. Generate in-range explicitly.
-uint16_t ident = random(); +uint16_t ident = (uint16_t)(random() % UINT16_MAX) + 1;And similarly in traceroute:
-uint16_t ident = random(); +uint16_t ident = (uint16_t)(random() % UINT16_MAX) + 1;Also applies to: 156-158, 180-182, 187-189
215-230: CLI UX: help strings are clear; keep IPv4/IPv6 option phrasing aligned.Minor nit: ensure capitalization is consistent (“ICMP ident field”) across both IPv4/IPv6 for polish.
Also applies to: 236-249
modules/ip/cli/icmp.c (1)
134-136: Default IDENT should honor 1..UINT16_MAX like the CLI.Avoid 0 and implicit narrowing from random().
-uint16_t ident = random(); +uint16_t ident = (uint16_t)(random() % UINT16_MAX) + 1;And in traceroute:
-uint16_t ident = random(); +uint16_t ident = (uint16_t)(random() % UINT16_MAX) + 1;Also applies to: 145-147, 162-163, 166-168
smoke/config_test.sh (2)
19-21: Verify multi-invocation semantics when adding the same group.Two consecutive “add group id 333” commands: confirm this appends members rather than replacing the prior set. If it replaces, member 42’s weight 102 would be lost.
Optionally consolidate into one command including weights for all members for clarity:
grcli nexthop add group id 333 members nh 42 weight 102 nh 45 nh 47
19-21: Add lightweight checks to assert group contents.After creation (and before deletion), assert group 333 exists and has 3 members with the expected weight so regressions are caught early.
# Example (adjust to actual output format): grcli nexthop show | grep -A3 'group .* id 333' || fail "group 333 missing" grcli nexthop show | grep 'nh 42 .*weight 102' || fail "weight missing for nh 42"Also applies to: 45-46
modules/ip6/datapath/meson.build (1)
13-15: ip6_loadbalance.c added to build: good. Consider adding a unit test stub.Add a minimal test target to exercise basic node registration to prevent future link/ABI regressions.
modules/ip/datapath/meson.build (1)
18-20: ip_loadbalance.c included: good. Mirror IPv6 by adding a test stub.Consider adding a basic test entry for ip_loadbalance to catch early integration issues.
modules/infra/cli/gr_cli_iface.h (1)
7-8: Header hygiene: avoid unnecessary includes in public headers.This header does not use symbols from <gr_nexthop.h>. Prefer removing it to reduce transitive dependencies and rebuild times.
-#include <gr_nexthop.h>If some downstream TU required it implicitly, include it there instead.
modules/ip/datapath/icmp_local_send.c (1)
13-14: Unused include.
<gr_vec.h>is not used in this file. Remove to keep includes lean.-#include <gr_vec.h>modules/infra/control/nexthop.c (3)
820-831: Equality is order-sensitive — verify intended semanticsCurrent comparison requires identical order and weights. If group order is not meaningful, consider order-insensitive comparison (e.g., sort by nh_id then compare).
833-845: Break early in group_list_add for minor speedupOnce found, no need to scan the rest.
gr_vec_foreach (group, groups) { if (group == nh) - found = true; + { found = true; break; } }
846-853: Stop iterating after deletion in group_list_delDeleting while iterating via swap is fine, but continue/bare loop is unnecessary and can be confusing.
gr_vec_foreach (group, groups) { if (group == nh) - gr_vec_del_swap(groups, __i); + { gr_vec_del_swap(groups, __i); break; } }
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
- docs/graph.svgis excluded by- !**/*.svg
📒 Files selected for processing (18)
- frr/rt_grout.c(3 hunks)
- modules/infra/api/gr_nexthop.h(3 hunks)
- modules/infra/cli/gr_cli_iface.h(1 hunks)
- modules/infra/cli/nexthop.c(5 hunks)
- modules/infra/control/gr_nh_control.h(1 hunks)
- modules/infra/control/nexthop.c(7 hunks)
- modules/ip/cli/icmp.c(5 hunks)
- modules/ip/datapath/icmp_local_send.c(3 hunks)
- modules/ip/datapath/ip_loadbalance.c(1 hunks)
- modules/ip/datapath/meson.build(1 hunks)
- modules/ip6/cli/icmp6.c(8 hunks)
- modules/ip6/datapath/icmp6_local_send.c(3 hunks)
- modules/ip6/datapath/ip6_loadbalance.c(1 hunks)
- modules/ip6/datapath/meson.build(1 hunks)
- smoke/config_test.sh(2 hunks)
- smoke/ip_loadbalance_frr_test.sh(1 hunks)
- smoke/ip_loadbalance_test.sh(1 hunks)
- subprojects/packagefiles/frr/meson-add-dependency-definition.patch(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- modules/infra/api/gr_nexthop.h
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{c,h}
⚙️ CodeRabbit configuration file
**/*.{c,h}: -gr_vec_*()functions cannot fail. No need to check their return value.
gr_vec_free(x)always setsx = NULL. There is no risk of double free.
ec_node_*()functions consume theirec_nodearguments. No leaks on error.
rte_node->ctxis an uint8_t array of size 16, not a pointer.- Don't suggest to replace
assert()with graceful error checking.- We compile with
-std=gnu2x. Unnamed parameters in function definitions are valid.
Files:
- modules/infra/cli/gr_cli_iface.h
- modules/infra/control/gr_nh_control.h
- modules/ip/datapath/icmp_local_send.c
- frr/rt_grout.c
- modules/ip/datapath/ip_loadbalance.c
- modules/ip6/datapath/ip6_loadbalance.c
- modules/ip6/cli/icmp6.c
- modules/ip/cli/icmp.c
- modules/ip6/datapath/icmp6_local_send.c
- modules/infra/cli/nexthop.c
- modules/infra/control/nexthop.c
**/*.sh
⚙️ CodeRabbit configuration file
**/*.sh: - Don't bother about unquoted shell variables.
Files:
- smoke/config_test.sh
- smoke/ip_loadbalance_frr_test.sh
- smoke/ip_loadbalance_test.sh
🪛 Shellcheck (0.11.0)
smoke/ip_loadbalance_frr_test.sh
[warning] 12-12: Quote this to prevent word splitting.
(SC2046)
[warning] 14-14: run_id is referenced but not assigned.
(SC2154)
smoke/ip_loadbalance_test.sh
[warning] 5-5: Quote this to prevent word splitting.
(SC2046)
[warning] 7-7: run_id is referenced but not assigned.
(SC2154)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: build-and-tests (clang-15, none, debugoptimized, ubuntu-22.04, false)
- GitHub Check: build-and-tests (gcc-14, address, debug, ubuntu-24.04, -Dfrr=enabled, true)
- GitHub Check: build-and-tests (clang-18, none, debugoptimized, ubuntu-24.04, false)
- GitHub Check: build-and-tests (clang-16, none, debugoptimized, ubuntu-24.04, false)
- GitHub Check: build-and-tests (gcc-13, none, debugoptimized, ubuntu-24.04, -Dfrr=enabled, false)
- GitHub Check: deb
- GitHub Check: rpm
🔇 Additional comments (24)
subprojects/packagefiles/frr/meson-add-dependency-definition.patch (1)
106-106: Confirm nexthop group limit alignment and commit sequencing.The search found no explicit nexthop group size limit definition matching the 128 multipath value set in the FRR patch. The
128values found in the codebase are buffer sizes and CLI parameters, not group capacity limits. Verify this is intentional (FRR and datapath handle sizing independently) or if a corresponding limit needs to be defined or documented in the codebase.Additionally, confirm that this patch is sequenced correctly in the commit order—specifically, that the increased multipath setting appears alongside or before the code that relies on it.
modules/ip6/cli/icmp6.c (1)
28-30: IDENT propagation is consistent end-to-end.Passing IDENT through request/receive and both call sites looks correct. LGTM.
Also applies to: 42-45, 50-57, 170-171, 201-202
modules/ip/cli/icmp.c (2)
36-38: IDENT parameter threading looks good.Request/receive structs and both call sites use the provided IDENT consistently. LGTM.
Also applies to: 50-52, 57-59, 152-153, 175-176
189-200: CLI grammar change looks fine; confirm ordering tolerance.Parser likely accepts options in any order; if order matters, ensure docs and parsing are aligned after moving IDENT before VRF in traceroute.
Also applies to: 206-215
modules/ip/datapath/icmp_local_send.c (1)
116-120: RSS flag now set with hash: resolved.Setting
mbuf->hash.rssandRTE_MBUF_F_RX_RSS_HASHaddresses prior TX-path hints about hash availability. LGTM.modules/ip6/datapath/icmp6_local_send.c (2)
105-108: LGTM!RSS hashing is properly configured for ECMP/active-active scenarios. Both the hash value and the flag are set correctly, addressing the past review comment.
50-55: Code change is correct; no action needed.The zero-count check (line 52) combined with the construction logic ensures members are non-NULL. In
nexthop.clines 877–883,members[i].nhis only assigned whennexthop_lookup_by_id()succeeds; on failure, the group creation exits early with error handling. Thus, any member that exists is guaranteed non-NULL.modules/infra/control/gr_nh_control.h (1)
57-65: LGTM!The group member and nexthop_info_group structures are well-defined with appropriate types.
smoke/ip_loadbalance_frr_test.sh (1)
1-60: LGTM!The FRR load-balancing test script is well-structured and exercises ECMP routing scenarios appropriately.
modules/infra/cli/nexthop.c (2)
306-346: LGTM!The group add function handles memory and errors correctly. Using
arg_u32avoids theatoioverflow issues mentioned in past comments.
478-500: LGTM!CLI command structure for nexthop groups is well-defined with appropriate constraints (1-128 members).
smoke/ip_loadbalance_test.sh (1)
1-56: LGTM!The load-balancing test script properly exercises ECMP routing with nexthop groups.
frr/rt_grout.c (3)
292-297: LGTM!Group nexthop translation to FRR format is straightforward and correct.
581-610: LGTM!The nexthop group add function properly handles memory allocation and error cases. The use of
callocaddresses past concerns about uninitialized data.
804-805: LGTM!Routing to the group handler when groups are present is the correct approach.
modules/ip/datapath/ip_loadbalance.c (1)
37-42: No changes needed; NULL member concern is invalid.The member initialization in
modules/infra/control/nexthop.c(lines 869-894) guarantees that.nhpointers are never NULL. When creating a group, if any nexthop lookup fails, the entire members array is freed and an error is returned. Only valid groups with all members present are used in datapath code. The TODO acknowledges that RSS hash may be missing from some drivers, not that members can be NULL.modules/infra/control/nexthop.c (8)
33-33: Global groups registry looks fineStatic vector is OK; freed in nh_fini via gr_vec_free. No return checks needed per guidelines.
317-323: Type wiring for GR_NH_T_GROUP is correctRegistration paths and constructor switch now recognize GROUP. Looks consistent with existing pattern.
Also applies to: 340-345
507-507: Good: detach from groups before route cleanupOrder makes sense; with the decref fix above, this prevents dangling refs.
Please confirm no other paths remove members (avoid double-decref).
518-519: Interface cleanup hook is appropriateEnsures group membership prunes when iface-bound NHs are cleared.
648-648: Freeing groups vector on finiCorrect cleanup; gr_vec_free sets pointer to NULL per guidelines.
854-862: group_free lifecycle OKDecrefs members, frees array, removes from registry. Safe since QSBR sync happens before ops->free is called in nexthop_decref.
931-937: group_nh_ops table is coherentCallbacks align with implemented functions.
945-946: Registering GROUP ops during init — goodFits existing init sequence with serializer and module registration.
c03a8b0    to
    98cdddb      
    Compare
  
    There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (2)
modules/ip6/datapath/ip6_loadbalance.c (1)
32-42: Add NULL check for group members; RSS fallback missing.Line 42 has the same critical issue as
ip_loadbalance.c:g->members[mbuf->hash.rss % g->n_members].nhmay be NULL whennexthop_lookup_by_id()fails during group population. The RSS hash may also be absent.Apply the same fix as suggested for the IPv4 version:
- Check for RSS hash availability; use deterministic fallback if absent.
- Scan up to
g->n_memberspositions (wrapping) to find first non-NULL member.- If all NULL, route to
NO_NEXTHOPedge.modules/infra/control/nexthop.c (1)
489-503: Ref leak and skipped element still present (duplicate issue).This function has two critical problems that were flagged in a previous review:
- No
nexthop_decrefcall for the removed member leaks a reference.- After swap-and-shrink,
i++skips the swapped-in element—duplicates won't be removed.Apply the fix from the prior review:
static void nh_groups_remove_member(const struct nexthop *nh) { struct nexthop_info_group *info; struct nexthop *group; gr_vec_foreach (group, groups) { info = nexthop_info_group(group); - for (uint32_t i = 0; i < info->n_members; i++) { + for (uint32_t i = 0; i < info->n_members; ) { if (info->members[i].nh == nh) { + nexthop_decref(info->members[i].nh); info->members[i].nh = info->members[info->n_members - 1].nh; info->members[i].weight = info->members[info->n_members - 1].weight; info->n_members--; + continue; } + i++; } } }
🧹 Nitpick comments (1)
modules/infra/control/nexthop.c (1)
833-844: Consider avoiding duplicate checks entirely.Linear search to detect duplicates is O(n), but if the caller guarantees uniqueness or if typical group counts are small, this may be acceptable.
If duplicate prevention is essential and group counts grow, consider maintaining a flag or using the fact that
gr_vec_addcan just append without checking.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
- docs/graph.svgis excluded by- !**/*.svg
📒 Files selected for processing (18)
- frr/rt_grout.c(3 hunks)
- modules/infra/api/gr_nexthop.h(3 hunks)
- modules/infra/cli/gr_cli_iface.h(1 hunks)
- modules/infra/cli/nexthop.c(5 hunks)
- modules/infra/control/gr_nh_control.h(1 hunks)
- modules/infra/control/nexthop.c(7 hunks)
- modules/ip/cli/icmp.c(5 hunks)
- modules/ip/datapath/icmp_local_send.c(3 hunks)
- modules/ip/datapath/ip_loadbalance.c(1 hunks)
- modules/ip/datapath/meson.build(1 hunks)
- modules/ip6/cli/icmp6.c(8 hunks)
- modules/ip6/datapath/icmp6_local_send.c(3 hunks)
- modules/ip6/datapath/ip6_loadbalance.c(1 hunks)
- modules/ip6/datapath/meson.build(1 hunks)
- smoke/config_test.sh(2 hunks)
- smoke/ip_loadbalance_frr_test.sh(1 hunks)
- smoke/ip_loadbalance_test.sh(1 hunks)
- subprojects/packagefiles/frr/meson-add-dependency-definition.patch(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (8)
- modules/ip6/datapath/icmp6_local_send.c
- modules/ip/datapath/icmp_local_send.c
- modules/ip/datapath/meson.build
- modules/infra/cli/gr_cli_iface.h
- modules/ip6/cli/icmp6.c
- modules/ip/cli/icmp.c
- modules/ip6/datapath/meson.build
- modules/infra/cli/nexthop.c
🧰 Additional context used
📓 Path-based instructions (2)
**/*.sh
⚙️ CodeRabbit configuration file
**/*.sh: - Don't bother about unquoted shell variables.
Files:
- smoke/ip_loadbalance_frr_test.sh
- smoke/config_test.sh
- smoke/ip_loadbalance_test.sh
**/*.{c,h}
⚙️ CodeRabbit configuration file
**/*.{c,h}: -gr_vec_*()functions cannot fail. No need to check their return value.
gr_vec_free(x)always setsx = NULL. There is no risk of double free.
ec_node_*()functions consume theirec_nodearguments. No leaks on error.
rte_node->ctxis an uint8_t array of size 16, not a pointer.- Don't suggest to replace
assert()with graceful error checking.- We compile with
-std=gnu2x. Unnamed parameters in function definitions are valid.
Files:
- modules/ip/datapath/ip_loadbalance.c
- modules/infra/control/nexthop.c
- modules/infra/api/gr_nexthop.h
- frr/rt_grout.c
- modules/infra/control/gr_nh_control.h
- modules/ip6/datapath/ip6_loadbalance.c
🧬 Code graph analysis (7)
smoke/ip_loadbalance_frr_test.sh (2)
smoke/_init_frr.sh (3)
create_interface(9-28)
set_ip_address(30-61)
set_ip_route(84-124)smoke/_init.sh (1)
netns_add(93-99)
smoke/config_test.sh (1)
modules/infra/control/gr_nh_control.h (1)
nexthop(22-28)
modules/ip/datapath/ip_loadbalance.c (5)
modules/infra/datapath/gr_mbuf.h (1)
gr_mbuf_is_traced(52-54)modules/infra/datapath/trace.c (1)
gr_mbuf_trace_add(522-549)modules/infra/control/gr_graph.h (1)
rte_node_enqueue_x1(22-25)modules/ip/datapath/ip_input.c (1)
ip_input_register_nexthop_type(36-43)modules/ip/datapath/ip_output.c (1)
ip_output_register_nexthop_type(47-54)
modules/infra/control/nexthop.c (4)
api/gr_errno.h (2)
errno_set(9-12)
errno_set_null(14-17)modules/infra/datapath/main_loop.c (1)
gr_datapath_rcu(300-302)main/event.c (1)
gr_event_register_serializer(34-54)main/module.c (1)
gr_register_module(49-61)
frr/rt_grout.c (1)
frr/zebra_dplane_grout.c (1)
grout_client_send_recv(335-372)
modules/ip6/datapath/ip6_loadbalance.c (6)
modules/infra/datapath/gr_mbuf.h (1)
gr_mbuf_is_traced(52-54)modules/infra/datapath/trace.c (1)
gr_mbuf_trace_add(522-549)modules/infra/control/gr_graph.h (1)
rte_node_enqueue_x1(22-25)modules/ip/datapath/ip_loadbalance.c (1)
loadbalance_register(53-56)modules/ip6/datapath/ip6_input.c (1)
ip6_input_register_nexthop_type(35-42)modules/ip6/datapath/ip6_output.c (1)
ip6_output_register_nexthop_type(41-48)
smoke/ip_loadbalance_test.sh (1)
smoke/_init.sh (1)
netns_add(93-99)
🪛 Shellcheck (0.11.0)
smoke/ip_loadbalance_frr_test.sh
[warning] 12-12: Quote this to prevent word splitting.
(SC2046)
[warning] 14-14: run_id is referenced but not assigned.
(SC2154)
smoke/ip_loadbalance_test.sh
[warning] 5-5: Quote this to prevent word splitting.
(SC2046)
[warning] 7-7: run_id is referenced but not assigned.
(SC2154)
🔇 Additional comments (10)
subprojects/packagefiles/frr/meson-add-dependency-definition.patch (1)
106-106: LGTM: Multipath value aligns with ECMP support.Increasing multipath to 128 is appropriate for the nexthop group functionality being added. This addresses the prior placement concern since the change is now integrated with the broader ECMP implementation in this PR.
modules/infra/api/gr_nexthop.h (1)
103-112: LGTM: Group nexthop structures are well-formed.The
gr_nexthop_group_memberandgr_nexthop_info_groupstructures correctly model ECMP groups with a flexible array member pattern.modules/infra/control/gr_nh_control.h (1)
57-65: LGTM: Internal group structures correctly use nexthop pointers.The
nh_group_memberandnexthop_info_groupfollow the established pattern and properly usegr_vecfor dynamic membership.smoke/config_test.sh (1)
19-20: LGTM: Group nexthop test coverage added.Tests both weighted and unweighted group members, consistent with the new API.
frr/rt_grout.c (2)
581-610: LGTM: Group nexthop handling correctly allocates and populates members.The length calculation (lines 587-588) correctly accounts for the flexible array member,
calloczero-initializes all fields, and the loop (lines 601-604) stays within bounds.
804-805: LGTM: Group path correctly prioritized in nexthop add/del flow.Groups are handled via
grout_add_nexthop_groupbefore falling through to single-nexthop logic.modules/ip/datapath/ip_loadbalance.c (1)
32-42: Remove the suggested NULL check—group members are always valid when group is active.The group creation path in
modules/infra/control/nexthop.cvalidates all member lookups before committing (lines 877–886): if any lookup fails, the entire update is aborted. Only fully populated groups enter the datapath, sog->members[i].nhcannot be NULL.The RSS hash TODO (line 41) is legitimate—
mbuf->hash.rssis undefined whenRTE_MBUF_F_RX_RSS_HASHis absent—but this is already acknowledged and doesn't require the complex scanning logic suggested.Likely an incorrect or invalid review comment.
modules/infra/control/nexthop.c (3)
820-831: LGTM: Group equality correctly compares members.The function now properly compares
n_membersfirst before checking array elements, addressing the prior concern.
854-861: LGTM: Group lifecycle and API conversion look correct.
group_freeproperly decrements member refs and cleans up.group_to_apicorrectly sets*len = 0on allocation failure (fixing the prior issue).Also applies to: 906-929
931-936: LGTM: Group type ops properly registered.The
group_nh_opsstructure and registration at line 945 correctly wire GROUP type support into the nexthop framework.Also applies to: 945-945
98cdddb    to
    1f82b7f      
    Compare
  
    There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
modules/ip/datapath/icmp_local_send.c (1)
49-54: Guard against NULL member and preserve const-correctness.The selected
g->members[ident % g->n_members].nhcan be NULL, causing a crash at line 65. Also, casting away const is unnecessary.As noted in the previous review, apply const-correctness and add a NULL check:
- struct nexthop_info_group *g = (struct nexthop_info_group *)gw->info; + const struct nexthop_info_group *g = (const struct nexthop_info_group *)gw->info; if (g->n_members == 0) return errno_set(EHOSTUNREACH); - gw = g->members[ident % g->n_members].nh; + const struct nexthop *m = g->members[ident % g->n_members].nh; + if (m == NULL) + return errno_set(EHOSTUNREACH); + gw = m;
🧹 Nitpick comments (1)
frr/rt_grout.c (1)
581-610: Consider adding a bounds check for n_members.While the dynamic allocation ensures sufficient buffer size, adding a sanity check on
n_memberswould guard against protocol limits or unexpected large values.static enum zebra_dplane_result grout_add_nexthop_group(struct zebra_dplane_ctx *ctx) { uint32_t nh_id = dplane_ctx_get_nhe_id(ctx); struct gr_nexthop_info_group *group; struct gr_nh_add_req *req = NULL; size_t len; + size_t n_members = dplane_ctx_get_nhe_nh_grp_count(ctx); + if (n_members > 256) { // or appropriate protocol limit + gr_log_err("nexthop-group %u has too many members (%zu)", nh_id, n_members); + return ZEBRA_DPLANE_REQUEST_FAILURE; + } + - len = sizeof(*req) + sizeof(*group) - + dplane_ctx_get_nhe_nh_grp_count(ctx) * sizeof(group->members[0]); + len = sizeof(*req) + sizeof(*group) + n_members * sizeof(group->members[0]); if ((req = calloc(1, len)) == NULL) return ZEBRA_DPLANE_REQUEST_FAILURE;
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
- frr/rt_grout.c(3 hunks)
- modules/ip/datapath/icmp_local_send.c(2 hunks)
- modules/ip6/datapath/icmp6_local_send.c(3 hunks)
- smoke/ip_loadbalance_frr_test.sh(1 hunks)
- smoke/ip_loadbalance_test.sh(1 hunks)
- subprojects/packagefiles/frr/meson-add-dependency-definition.patch(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- modules/ip6/datapath/icmp6_local_send.c
- subprojects/packagefiles/frr/meson-add-dependency-definition.patch
🧰 Additional context used
📓 Path-based instructions (2)
**/*.sh
⚙️ CodeRabbit configuration file
**/*.sh: - Don't bother about unquoted shell variables.
Files:
- smoke/ip_loadbalance_frr_test.sh
- smoke/ip_loadbalance_test.sh
**/*.{c,h}
⚙️ CodeRabbit configuration file
**/*.{c,h}: -gr_vec_*()functions cannot fail. No need to check their return value.
gr_vec_free(x)always setsx = NULL. There is no risk of double free.
ec_node_*()functions consume theirec_nodearguments. No leaks on error.
rte_node->ctxis an uint8_t array of size 16, not a pointer.- Don't suggest to replace
assert()with graceful error checking.- We compile with
-std=gnu2x. Unnamed parameters in function definitions are valid.
Files:
- modules/ip/datapath/icmp_local_send.c
- frr/rt_grout.c
🧬 Code graph analysis (4)
smoke/ip_loadbalance_frr_test.sh (2)
smoke/_init_frr.sh (3)
create_interface(9-28)
set_ip_address(30-61)
set_ip_route(84-124)smoke/_init.sh (1)
netns_add(93-99)
modules/ip/datapath/icmp_local_send.c (1)
api/gr_errno.h (1)
errno_set(9-12)
frr/rt_grout.c (1)
frr/zebra_dplane_grout.c (1)
grout_client_send_recv(335-372)
smoke/ip_loadbalance_test.sh (1)
smoke/_init.sh (1)
netns_add(93-99)
🪛 Shellcheck (0.11.0)
smoke/ip_loadbalance_frr_test.sh
[warning] 12-12: Quote this to prevent word splitting.
(SC2046)
[warning] 14-14: run_id is referenced but not assigned.
(SC2154)
smoke/ip_loadbalance_test.sh
[warning] 5-5: Quote this to prevent word splitting.
(SC2046)
[warning] 7-7: run_id is referenced but not assigned.
(SC2154)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: deb
- GitHub Check: rpm
- GitHub Check: build-and-tests (clang-15, none, debugoptimized, ubuntu-22.04, false)
- GitHub Check: build-and-tests (clang-16, none, debugoptimized, ubuntu-24.04, false)
- GitHub Check: build-and-tests (gcc-13, none, debugoptimized, ubuntu-24.04, -Dfrr=enabled, false)
- GitHub Check: build-and-tests (clang-18, none, debugoptimized, ubuntu-24.04, false)
- GitHub Check: build-and-tests (gcc-14, address, debug, ubuntu-24.04, -Dfrr=enabled, true)
🔇 Additional comments (5)
modules/ip/datapath/icmp_local_send.c (1)
115-118: LGTM!The RSS hash setup correctly sets both
mbuf->hash.rssand theRTE_MBUF_F_RX_RSS_HASHflag, enabling ECMP traffic distribution for groups and active/active bonds.frr/rt_grout.c (2)
292-297: LGTM!Group nexthop translation correctly initializes fields and sets
nh_familytoAF_UNSPEC.
804-805: LGTM!Group handling is correctly prioritized before single-nexthop processing.
smoke/ip_loadbalance_frr_test.sh (1)
1-60: LGTM!The test correctly sets up ECMP routing and validates reachability through both paths. The shellcheck warnings about
run_idand quoting are false positives—run_idis defined in the sourced_init_frr.shanddirnameoutput is safe.smoke/ip_loadbalance_test.sh (1)
1-56: LGTM!The test correctly demonstrates ECMP group creation, local/external traffic distribution, and cleanup. The shellcheck warnings are false positives—
run_idis defined in_init.sh.
1f82b7f    to
    6c8142d      
    Compare
  
    There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (2)
modules/ip6/datapath/ip6_loadbalance.c (1)
37-42: NULL check for group members still missing.The past review correctly identified that
g->members[i].nhcan be NULL (pernexthop.c:880), yet line 42 dereferences without checking. Add a NULL-check loop to skip NULL members and implement RSS fallback per the TODO.modules/infra/control/nexthop.c (1)
489-503: Ref leak and skipped element in member removal.Past review identified that removing a member leaks its reference (no
nexthop_decrefcall) and the swap-and-shrink skips the swapped-in element. Duplicates won't be removed.Apply the past review's fix:
static void nh_groups_remove_member(const struct nexthop *nh) { struct nexthop_info_group *info; struct nexthop *group; gr_vec_foreach (group, groups) { info = nexthop_info_group(group); - for (uint32_t i = 0; i < info->n_members; i++) { - if (info->members[i].nh == nh) { - info->members[i].nh = info->members[info->n_members - 1].nh; - info->members[i].weight = info->members[info->n_members - 1].weight; - info->n_members--; - } - } + for (uint32_t i = 0; i < info->n_members; ) { + if (info->members[i].nh == nh) { + nexthop_decref(info->members[i].nh); + info->members[i] = info->members[info->n_members - 1]; + info->n_members--; + continue; + } + i++; + } } }
🧹 Nitpick comments (5)
modules/ip/datapath/meson.build (1)
18-18: Build integration looks good; consider adding test coverage.The addition of
ip_loadbalance.cto the build is correct. However, since this introduces new loadbalancing logic for ECMP, consider adding a corresponding test entry in thetestsarray to validate the loadbalance behavior.frr/rt_grout.c (1)
292-297: GROUP mapping: confirm nh->type semantics; drop redundant assignments.nh->ifindex/vrf_id/weight are already set at Lines 193-196; consider removing duplicates. Also, should nh->type be set for GROUP (or explicitly left unset)? Please confirm expected behavior for zebra_nhg_kernel_find().
modules/infra/cli/gr_cli_iface.h (1)
7-7: Remove unused header include to reduce coupling.gr_cli_iface.h doesn’t use gr_nexthop types; consider dropping the include or moving it to the .c that needs it.
smoke/ip_loadbalance_frr_test.sh (2)
50-56: Add readiness wait after manual vtysh route.Unlike set_ip_route, this path doesn’t wait for the route to appear, which can flake tests. Add a short retry loop.
vtysh <<-EOF configure terminal ip route 192.0.0.0/24 172.16.2.2 EOF + +# Wait for route to be visible (mirror set_ip_route behavior) +max_tries=5 +count=0 +grep_pattern="^0[[:space:]]\+192\.0\.0\.0/24\>.*\<172\.16\.2\.2\>" +while ! grcli route show vrf 0 | grep -q "$grep_pattern"; do + if [ "$count" -ge "$max_tries" ]; then + echo "Route 192.0.0.0/24 via 172.16.2.2 not found after $max_tries attempts." + exit 1 + fi + sleep 1 + count=$((count + 1)) +done
14-17: Confirm run_id is initialized by the sourced helpers.If run_id is not exported by _init_frr.sh/_init.sh, consider defaulting it (e.g., run_id=${run_id:-r}) to avoid accidental empty names.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
- docs/graph.svgis excluded by- !**/*.svg
📒 Files selected for processing (18)
- frr/rt_grout.c(3 hunks)
- modules/infra/api/gr_nexthop.h(3 hunks)
- modules/infra/cli/gr_cli_iface.h(1 hunks)
- modules/infra/cli/nexthop.c(5 hunks)
- modules/infra/control/gr_nh_control.h(1 hunks)
- modules/infra/control/nexthop.c(7 hunks)
- modules/ip/cli/icmp.c(5 hunks)
- modules/ip/datapath/icmp_local_send.c(2 hunks)
- modules/ip/datapath/ip_loadbalance.c(1 hunks)
- modules/ip/datapath/meson.build(1 hunks)
- modules/ip6/cli/icmp6.c(8 hunks)
- modules/ip6/datapath/icmp6_local_send.c(3 hunks)
- modules/ip6/datapath/ip6_loadbalance.c(1 hunks)
- modules/ip6/datapath/meson.build(1 hunks)
- smoke/config_test.sh(2 hunks)
- smoke/ip_loadbalance_frr_test.sh(1 hunks)
- smoke/ip_loadbalance_test.sh(1 hunks)
- subprojects/packagefiles/frr/meson-add-dependency-definition.patch(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (7)
- modules/ip/datapath/ip_loadbalance.c
- modules/infra/api/gr_nexthop.h
- modules/ip6/datapath/icmp6_local_send.c
- smoke/config_test.sh
- modules/ip/cli/icmp.c
- modules/ip/datapath/icmp_local_send.c
- modules/ip6/cli/icmp6.c
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{c,h}
⚙️ CodeRabbit configuration file
**/*.{c,h}: -gr_vec_*()functions cannot fail. No need to check their return value.
gr_vec_free(x)always setsx = NULL. There is no risk of double free.
ec_node_*()functions consume theirec_nodearguments. No leaks on error.
rte_node->ctxis an uint8_t array of size 16, not a pointer.- Don't suggest to replace
assert()with graceful error checking.- We compile with
-std=gnu2x. Unnamed parameters in function definitions are valid.
Files:
- modules/infra/cli/gr_cli_iface.h
- frr/rt_grout.c
- modules/infra/control/nexthop.c
- modules/infra/cli/nexthop.c
- modules/ip6/datapath/ip6_loadbalance.c
- modules/infra/control/gr_nh_control.h
**/*.sh
⚙️ CodeRabbit configuration file
**/*.sh: - Don't bother about unquoted shell variables.
Files:
- smoke/ip_loadbalance_frr_test.sh
- smoke/ip_loadbalance_test.sh
🧬 Code graph analysis (6)
frr/rt_grout.c (1)
frr/zebra_dplane_grout.c (1)
grout_client_send_recv(335-372)
modules/infra/control/nexthop.c (2)
api/gr_errno.h (2)
errno_set(9-12)
errno_set_null(14-17)modules/infra/datapath/main_loop.c (1)
gr_datapath_rcu(300-302)
modules/infra/cli/nexthop.c (3)
cli/gr_cli.h (1)
arg_u32(72-78)api/gr_api.h (1)
gr_api_client_send_recv(41-52)cli/ecoli.c (1)
with_help(15-24)
modules/ip6/datapath/ip6_loadbalance.c (6)
modules/infra/datapath/gr_mbuf.h (1)
gr_mbuf_is_traced(52-54)modules/infra/datapath/trace.c (1)
gr_mbuf_trace_add(522-549)modules/infra/control/gr_graph.h (1)
rte_node_enqueue_x1(22-25)modules/ip/datapath/ip_loadbalance.c (1)
loadbalance_register(53-56)modules/ip6/datapath/ip6_input.c (1)
ip6_input_register_nexthop_type(35-42)modules/ip6/datapath/ip6_output.c (1)
ip6_output_register_nexthop_type(41-48)
smoke/ip_loadbalance_frr_test.sh (2)
smoke/_init_frr.sh (3)
create_interface(9-28)
set_ip_address(30-61)
set_ip_route(84-124)smoke/_init.sh (1)
netns_add(93-99)
smoke/ip_loadbalance_test.sh (1)
smoke/_init.sh (1)
netns_add(93-99)
🪛 Shellcheck (0.11.0)
smoke/ip_loadbalance_frr_test.sh
[warning] 12-12: Quote this to prevent word splitting.
(SC2046)
[warning] 14-14: run_id is referenced but not assigned.
(SC2154)
smoke/ip_loadbalance_test.sh
[warning] 5-5: Quote this to prevent word splitting.
(SC2046)
[warning] 7-7: run_id is referenced but not assigned.
(SC2154)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: build-and-tests (clang-18, none, debugoptimized, ubuntu-24.04, false)
- GitHub Check: build-and-tests (clang-15, none, debugoptimized, ubuntu-22.04, false)
- GitHub Check: build-and-tests (clang-16, none, debugoptimized, ubuntu-24.04, false)
- GitHub Check: build-and-tests (gcc-13, none, debugoptimized, ubuntu-24.04, -Dfrr=enabled, false)
- GitHub Check: build-and-tests (gcc-14, address, debug, ubuntu-24.04, -Dfrr=enabled, true)
- GitHub Check: build-cross-aarch64
- GitHub Check: deb
- GitHub Check: rpm
🔇 Additional comments (21)
subprojects/packagefiles/frr/meson-add-dependency-definition.patch (1)
106-106: Verify multipath=128 is supported in nexthop group implementation.The codebase search did not surface explicit validation or limits on nexthop group member count. Confirm that the control plane and datapath handle 128-member groups correctly, particularly in load-balancing logic (modules/ip/datapath/ip_loadbalance.c, modules/ip6/datapath/ip6_loadbalance.c) and any group member iteration loops.
frr/rt_grout.c (1)
804-806: Early dispatch to group add path looks correct.Defers to grout_add_nexthop_group when members exist; matches the new flow.
modules/ip6/datapath/meson.build (1)
13-13: Wiring LGTM.ip6_loadbalance.c added to build; consistent with ECMP integration.
modules/infra/cli/nexthop.c (2)
114-125: Group formatter: formatting looks good.Using unsigned formats and SAFE_BUF is correct.
139-144: Registration LGTM.group_formatter properly wired with type GR_NH_T_GROUP.
modules/infra/control/gr_nh_control.h (2)
57-60: LGTM: Clean group member structure.The
nh_group_memberstruct is well-defined with nexthop pointer and weight for ECMP load balancing.
62-65: LGTM: Type info follows established pattern.The
nexthop_info_groupdefinition is consistent with other nexthop type info structures.smoke/ip_loadbalance_test.sh (1)
1-56: Test script structure is sound.The ECMP test covers local and external traffic scenarios appropriately. Cleanup at line 56 removes only the group nexthop; full teardown may rely on
_init.sh.modules/ip6/datapath/ip6_loadbalance.c (2)
14-25: LGTM: Standard node structure.Edge definitions and process function signature follow the established pattern.
53-74: LGTM: Registration and node definition.Nexthop type registration and node setup mirror the IPv4 implementation correctly.
modules/infra/control/nexthop.c (11)
33-33: LGTM: Groups tracking vector.Global groups vector declared appropriately for tracking GROUP nexthops.
317-317: LGTM: Type registration.GR_NH_T_GROUP case added consistently.
340-340: LGTM: Constructor case.GROUP type added to nexthop_new switch.
507-507: LGTM: Group cleanup integration.Calls to
nh_groups_remove_memberare placed appropriately in cleanup paths (though the function itself needs fixes per previous comment).Also applies to: 518-518
648-648: LGTM: Groups vector cleanup.Proper teardown of the groups vector in module finalization.
820-831: LGTM: Group equality check.Properly compares member count and then individual members/weights.
833-856: LGTM: Group list management.
group_list_addandgroup_list_delcorrectly maintain the groups vector.
858-865: LGTM: Group lifecycle cleanup.
group_freecorrectly decrements member refs and removes from tracking list.
867-906: LGTM: Import function corrected.Previous allocation and refcount issues have been addressed. Allocation is now correct (line 874), and old members are properly decremented (lines 898-899).
908-931: LGTM: API conversion function.
group_to_apicorrectly converts internal representation to public API, and the allocation failure path now properly sets output length (line 917).
933-947: LGTM: Operations and registration.Group nexthop operations struct and registration are complete and correct.
6c8142d    to
    e652941      
    Compare
  
    There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
frr/rt_grout.c (2)
292-297: Remove redundant field assignments.Lines 193-195 already initialize
ifindex,vrf_id, andweightfor all nexthop types. The assignments in lines 293-295 are duplicates.Apply this diff:
case GR_NH_T_GROUP: - nh->ifindex = gr_nh->iface_id; - nh->vrf_id = gr_nh->vrf_id; *nh_family = AF_UNSPEC; - nh->weight = 1; break;
588-589: Add overflow check for group member count.The length calculation can overflow if
dplane_ctx_get_nhe_nh_grp_count(ctx)returns a very large value. Although the build configuration limits multipath to 128, defensive bounds checking prevents potential issues.Apply this diff:
+ size_t count = dplane_ctx_get_nhe_nh_grp_count(ctx); + size_t member_sz = sizeof(group->members[0]); + size_t hdr_sz = sizeof(*req) + sizeof(*group); + + if (count > (SIZE_MAX - hdr_sz) / member_sz) { + gr_log_err("nexthop group %u has too many members: %zu", nh_id, count); + return ZEBRA_DPLANE_REQUEST_FAILURE; + } + - len = sizeof(*req) + sizeof(*group) - + dplane_ctx_get_nhe_nh_grp_count(ctx) * sizeof(group->members[0]); + len = hdr_sz + count * member_sz;
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- frr/rt_grout.c(3 hunks)
- smoke/ip_loadbalance_frr_test.sh(1 hunks)
- subprojects/packagefiles/frr/meson-add-dependency-definition.patch(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- subprojects/packagefiles/frr/meson-add-dependency-definition.patch
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{c,h}
⚙️ CodeRabbit configuration file
**/*.{c,h}: -gr_vec_*()functions cannot fail. No need to check their return value.
gr_vec_free(x)always setsx = NULL. There is no risk of double free.
ec_node_*()functions consume theirec_nodearguments. No leaks on error.
rte_node->ctxis an uint8_t array of size 16, not a pointer.- Don't suggest to replace
assert()with graceful error checking.- We compile with
-std=gnu2x. Unnamed parameters in function definitions are valid.
Files:
- frr/rt_grout.c
**/*.sh
⚙️ CodeRabbit configuration file
**/*.sh: - Don't bother about unquoted shell variables.
Files:
- smoke/ip_loadbalance_frr_test.sh
🧬 Code graph analysis (2)
frr/rt_grout.c (1)
frr/zebra_dplane_grout.c (1)
grout_client_send_recv(335-372)
smoke/ip_loadbalance_frr_test.sh (2)
smoke/_init_frr.sh (3)
create_interface(9-28)
set_ip_address(30-61)
set_ip_route(84-124)smoke/_init.sh (1)
netns_add(93-99)
🪛 Shellcheck (0.11.0)
smoke/ip_loadbalance_frr_test.sh
[warning] 12-12: Quote this to prevent word splitting.
(SC2046)
[warning] 14-14: run_id is referenced but not assigned.
(SC2154)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: build-and-tests (gcc-14, address, debug, ubuntu-24.04, -Dfrr=enabled, true)
- GitHub Check: build-and-tests (clang-18, none, debugoptimized, ubuntu-24.04, false)
- GitHub Check: build-and-tests (clang-16, none, debugoptimized, ubuntu-24.04, false)
- GitHub Check: build-and-tests (gcc-13, none, debugoptimized, ubuntu-24.04, -Dfrr=enabled, false)
- GitHub Check: build-cross-aarch64
- GitHub Check: build-and-tests (clang-15, none, debugoptimized, ubuntu-22.04, false)
- GitHub Check: deb
- GitHub Check: rpm
🔇 Additional comments (2)
frr/rt_grout.c (1)
806-807: LGTM!The group delegation logic correctly checks for nexthop groups and routes them to the appropriate handler before falling through to the single-nexthop path.
smoke/ip_loadbalance_frr_test.sh (1)
14-16: No issues found.The
run_idvariable is properly exported by the test harness through the sourcing chain:ip_loadbalance_frr_test.shsources_init_frr.sh(line 12), which sources_init.sh(line 6), whererun_idis defined (line 104).
        
          
                modules/infra/cli/gr_cli_iface.h
              
                Outdated
          
        
      | #pragma once | ||
|  | ||
| #include <gr_infra.h> | ||
| #include <gr_nexthop.h> | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you move the include from modules/infra/cli/nexthop.c to here?
        
          
                modules/infra/cli/nexthop.c
              
                Outdated
          
        
      | "group [(id ID)] members MEMBERS", | ||
| nh_group_add, | ||
| "Add a new nexthop group.", | ||
| with_help("Nexthop ID.", ec_node_uint("ID", 1, UINT32_MAX - 1, 10)), | ||
| with_help( | ||
| "Nexthop member IDs with relative weights.", | ||
| ec_node_many( | ||
| "MEMBERS", | ||
| EC_NODE_CMD( | ||
| EC_NO_ID, | ||
| "nh NHID [weight WEIGHT]", | ||
| ec_node_uint("NHID", 1, UINT32_MAX - 1, 10), | ||
| ec_node_uint("WEIGHT", 1, UINT32_MAX - 1, 10) | ||
| ), | ||
| 1, | ||
| 128 | ||
| ) | ||
| ) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change to:
"group [(id ID)] (member MEMBER)+",
...
	with_help(
		"Nexthop member ID with relative weight.",
		EC_NODE_CMD(
			"MEMBER",
			"nh NHID [weight WEIGHT]",
			ec_node_uint("NHID", 1, UINT32_MAX - 1, 10),
			ec_node_uint("WEIGHT", 1, UINT32_MAX - 1, 10)
		)
	)| // Fake RSS to spread the traffic | ||
| // for ECMP routes or active/active bonds. | ||
| mbuf->hash.rss = msg->ident; | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note for later: compute rss for locally generated packets
e652941    to
    0ab2ae9      
    Compare
  
    There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️  Outside diff range comments (1)
modules/infra/api/gr_nexthop.h (1)
88-101: Fix invalid empty addr member; provide a correct overlaystruct { } addr; is invalid and breaks &addr call sites. Replace with an overlay that aliases the largest address type and keeps direct ipv4/ipv6 access.
struct gr_nexthop_info_l3 { @@ - union { - struct { - } addr; - ip4_addr_t ipv4; - struct rte_ipv6_addr ipv6; - }; + union { + // Keep direct fields for typed access. + struct { + ip4_addr_t ipv4; + struct rte_ipv6_addr ipv6; + }; + // Alias used by generic helpers/macros. + struct rte_ipv6_addr addr; + };
♻️ Duplicate comments (5)
modules/infra/cli/nexthop.c (1)
486-500: CLI grammar should require at least one MEMBERUse “(member MEMBER)+” to prevent zero-member groups at parse-time. This matches runtime expectations.
- "group [(id ID)] (member MEMBER)*", + "group [(id ID)] (member MEMBER)+",modules/ip6/datapath/icmp6_local_send.c (1)
50-56: Guard against NULL group member after moduloYou handle empty groups; also check the selected member pointer before deref to avoid NULL.
Suggested:
if (gw->type == GR_NH_T_GROUP) { struct nexthop_info_group *g = (struct nexthop_info_group *)gw->info; if (g->n_members == 0) return errno_set(EHOSTUNREACH); - gw = g->members[ident % g->n_members].nh; + const struct nexthop *m = g->members[ident % g->n_members].nh; + if (m == NULL) + return errno_set(EHOSTUNREACH); + gw = m; }modules/ip/datapath/ip_loadbalance.c (1)
41-42: RSS may be absent; add fallback (softrss) or guardWhen RTE_MBUF_F_RX_RSS_HASH isn’t set, mbuf->hash.rss can be 0, skewing distribution. Add a software hash fallback or at least guard and randomize.
frr/rt_grout.c (1)
581-612: Harden allocation size math and unify cleanuplen can overflow for large member counts. Also prefer a single out path. Suggested patch:
-static enum zebra_dplane_result grout_add_nexthop_group(struct zebra_dplane_ctx *ctx) { - enum zebra_dplane_result ret = ZEBRA_DPLANE_REQUEST_SUCCESS; +static enum zebra_dplane_result grout_add_nexthop_group(struct zebra_dplane_ctx *ctx) { + enum zebra_dplane_result ret = ZEBRA_DPLANE_REQUEST_FAILURE; uint32_t nh_id = dplane_ctx_get_nhe_id(ctx); struct gr_nexthop_info_group *group; struct gr_nh_add_req *req = NULL; - size_t len; + size_t len, count, hdr, msz; - len = sizeof(*req) + sizeof(*group) - + dplane_ctx_get_nhe_nh_grp_count(ctx) * sizeof(group->members[0]); - if ((req = calloc(1, len)) == NULL) - return ZEBRA_DPLANE_REQUEST_FAILURE; + count = dplane_ctx_get_nhe_nh_grp_count(ctx); + hdr = sizeof(*req) + sizeof(*group); + msz = sizeof(((struct gr_nexthop_info_group *)0)->members[0]); + if (count > (SIZE_MAX - hdr) / msz) + goto out; + len = hdr + count * msz; + if ((req = calloc(1, len)) == NULL) + goto out; @@ - group = (struct gr_nexthop_info_group *)req->nh.info; - group->n_members = dplane_ctx_get_nhe_nh_grp_count(ctx); + group = (struct gr_nexthop_info_group *)req->nh.info; + group->n_members = count; @@ - if (grout_client_send_recv(GR_NH_ADD, len, req, NULL) < 0) - ret = ZEBRA_DPLANE_REQUEST_FAILURE; - - free(req); - return ret; + if (grout_client_send_recv(GR_NH_ADD, len, req, NULL) < 0) + goto out; + ret = ZEBRA_DPLANE_REQUEST_SUCCESS; +out: + free(req); + return ret; }modules/ip6/datapath/ip6_loadbalance.c (1)
37-42: Add NULL check for group members; implement RSS fallback.Past review correctly identified that
g->members[mbuf->hash.rss % g->n_members].nhcan be NULL (evidence from nexthop.c:880 shows members are assigned only when lookup succeeds). Additionally, the TODO at line 41 remains unaddressed—mbuf->hash.rssis used unconditionally even whenRTE_MBUF_F_RX_RSS_HASHis not set.Apply this diff to add NULL-check loop and RSS fallback:
edge = OUTPUT; if (unlikely(g->n_members == 0)) { edge = NO_NEXTHOP; goto next; } - // TODO: increment xstat on ! mbuf->ol_flags & RTE_MBUF_F_RX_RSS_HASH - d->nh = g->members[mbuf->hash.rss % g->n_members].nh; + + uint32_t idx; + if (mbuf->ol_flags & RTE_MBUF_F_RX_RSS_HASH) { + idx = mbuf->hash.rss % g->n_members; + } else { + // TODO: increment xstat for missing RSS + idx = 0; // fallback to first member + } + + // Scan for first non-NULL member starting at idx + struct nexthop *nh = NULL; + for (uint32_t j = 0; j < g->n_members; j++) { + uint32_t try_idx = (idx + j) % g->n_members; + if (g->members[try_idx].nh != NULL) { + nh = g->members[try_idx].nh; + break; + } + } + + if (nh == NULL) { + edge = NO_NEXTHOP; + goto next; + } + d->nh = nh;
🧹 Nitpick comments (11)
modules/ip6/datapath/icmp6_local_send.c (1)
14-14: Unused include?gr_vec.h appears unused here. Drop it if not needed.
smoke/ip_loadbalance_frr_test.sh (2)
14-16: Ensure run_id is set (ShellCheck SC2154)If _init_frr.sh doesn’t export run_id, provide a fallback to avoid empty interface names.
Example:
. $(dirname $0)/_init_frr.sh + +: "${run_id:=$(printf '%02d' $((RANDOM%90+10)))}" p0=${run_id}0 p1=${run_id}1 p2=${run_id}2
55-60: Optional: add fail-fast and brief assertionsConsider
set -euo pipefailat top and check pings’ exit status to tighten the smoke.modules/infra/cli/nexthop.c (1)
333-343: Bound-check member count during fillValidate group->n_members doesn’t exceed the allocated count before writing each entry.
Example:
+ uint32_t max = n_members; while ((n = ec_pnode_find_next(p, n, "MEMBER", false)) != NULL) { + if (group->n_members >= max) { errno = EOVERFLOW; goto out; } ... }modules/ip/datapath/ip_loadbalance.c (3)
4-9: Use the typed accessor and include the proper headerPrefer the GR_NH_TYPE_INFO accessor to avoid raw casts and to assert type at runtime.
Apply:
#include <gr_graph.h> #include <gr_ip4_datapath.h> +#include <gr_nh_control.h> #include <gr_mbuf.h> #include <gr_trace.h> -#include <gr_vec.h> @@ - g = (struct nexthop_info_group *)d->nh->info; + g = nexthop_info_group(d->nh);Also applies to: 35-36
10-13: Trim unused includesrte_mbuf.h is used; rte_ip.h and rte_fib.h appear unused here. Consider removing to keep deps tight.
-#include <rte_fib.h> -#include <rte_ip.h> #include <rte_mbuf.h>
42-42: Weights ignoredg->members[..].weight is unused; if weighted ECMP is intended, note TODO or implement stride/WRR.
smoke/ip_loadbalance_test.sh (2)
1-6: Fail fast for test stabilityEnable strict mode so failures surface early.
-#!/bin/bash +#!/bin/bash +set -euo pipefail
31-36: Optional: cleanup symmetryConsider deleting nh IDs 100/101 and the route after the test to leave a clean state if the process keeps running.
Also applies to: 56-56
frr/rt_grout.c (1)
292-297: Remove redundant assignmentsnh->ifindex and nh->vrf_id are already set before the switch; drop repeats in GR_NH_T_GROUP case.
modules/infra/control/gr_nh_control.h (1)
57-66: Add documentation comment clarifying heap ownership of members arrayThe free path is correctly implemented—
group_free()at line 843 unconditionally frees the members array viarte_free(). However, add a comment to thenexthop_info_groupstruct definition (gr_nh_control.h:62-65) noting thatmembersis heap-allocated and owned by this structure.GR_NH_TYPE_INFO(GR_NH_T_GROUP, nexthop_info_group, { uint32_t n_members; struct nh_group_member *members; // Heap-allocated; freed by group_free() });
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
- docs/graph.svgis excluded by- !**/*.svg
📒 Files selected for processing (18)
- frr/rt_grout.c(3 hunks)
- modules/infra/api/gr_nexthop.h(3 hunks)
- modules/infra/cli/gr_cli_iface.h(1 hunks)
- modules/infra/cli/nexthop.c(5 hunks)
- modules/infra/control/gr_nh_control.h(1 hunks)
- modules/infra/control/nexthop.c(5 hunks)
- modules/ip/cli/icmp.c(5 hunks)
- modules/ip/datapath/icmp_local_send.c(2 hunks)
- modules/ip/datapath/ip_loadbalance.c(1 hunks)
- modules/ip/datapath/meson.build(1 hunks)
- modules/ip6/cli/icmp6.c(8 hunks)
- modules/ip6/datapath/icmp6_local_send.c(3 hunks)
- modules/ip6/datapath/ip6_loadbalance.c(1 hunks)
- modules/ip6/datapath/meson.build(1 hunks)
- smoke/config_test.sh(2 hunks)
- smoke/ip_loadbalance_frr_test.sh(1 hunks)
- smoke/ip_loadbalance_test.sh(1 hunks)
- subprojects/packagefiles/frr/meson-add-dependency-definition.patch(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (6)
- modules/ip6/datapath/meson.build
- modules/ip/datapath/meson.build
- modules/infra/cli/gr_cli_iface.h
- modules/ip/datapath/icmp_local_send.c
- subprojects/packagefiles/frr/meson-add-dependency-definition.patch
- modules/infra/control/nexthop.c
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{c,h}
⚙️ CodeRabbit configuration file
**/*.{c,h}: -gr_vec_*()functions cannot fail. No need to check their return value.
gr_vec_free(x)always setsx = NULL. There is no risk of double free.
ec_node_*()functions consume theirec_nodearguments. No leaks on error.
rte_node->ctxis an uint8_t array of size 16, not a pointer.- Don't suggest to replace
assert()with graceful error checking.- We compile with
-std=gnu2x. Unnamed parameters in function definitions are valid.
Files:
- modules/infra/control/gr_nh_control.h
- frr/rt_grout.c
- modules/ip6/datapath/icmp6_local_send.c
- modules/ip/datapath/ip_loadbalance.c
- modules/infra/api/gr_nexthop.h
- modules/ip/cli/icmp.c
- modules/ip6/cli/icmp6.c
- modules/ip6/datapath/ip6_loadbalance.c
- modules/infra/cli/nexthop.c
**/*.sh
⚙️ CodeRabbit configuration file
**/*.sh: - Don't bother about unquoted shell variables.
Files:
- smoke/ip_loadbalance_test.sh
- smoke/ip_loadbalance_frr_test.sh
- smoke/config_test.sh
🧠 Learnings (2)
📚 Learning: 2025-10-02T07:42:42.135Z
Learnt from: rjarry
PR: DPDK/grout#326
File: modules/infra/control/graph.c:360-364
Timestamp: 2025-10-02T07:42:42.135Z
Learning: In the grout codebase, `gr_vec_free()` is a macro defined in `main/gr_vec.h` that automatically sets the vector pointer to NULL after freeing the memory, preventing double-free issues. The macro definition is: `#define gr_vec_free(v) ((v) ? free(__gr_vec_hdr(v)) : (void)0, (v) = NULL)`.
Applied to files:
- modules/infra/control/gr_nh_control.h
📚 Learning: 2025-10-08T21:22:45.922Z
Learnt from: rjarry
PR: DPDK/grout#334
File: modules/infra/control/worker.c:278-281
Timestamp: 2025-10-08T21:22:45.922Z
Learning: In the codebase, `gr_vec_add` is a macro that does not return any value and cannot fail. Do not suggest checking its return value or adding error handling around it.
Applied to files:
- modules/infra/control/gr_nh_control.h
🧬 Code graph analysis (10)
frr/rt_grout.c (1)
frr/zebra_dplane_grout.c (1)
grout_client_send_recv(335-372)
modules/ip6/datapath/icmp6_local_send.c (1)
api/gr_errno.h (1)
errno_set(9-12)
modules/ip/datapath/ip_loadbalance.c (5)
modules/infra/datapath/gr_mbuf.h (1)
gr_mbuf_is_traced(52-54)modules/infra/datapath/trace.c (1)
gr_mbuf_trace_add(522-549)modules/infra/control/gr_graph.h (1)
rte_node_enqueue_x1(22-25)modules/ip6/datapath/ip6_loadbalance.c (1)
loadbalance_register(53-55)modules/ip/datapath/ip_output.c (1)
ip_output_register_nexthop_type(48-55)
smoke/ip_loadbalance_test.sh (1)
smoke/_init.sh (1)
netns_add(93-99)
modules/ip/cli/icmp.c (3)
cli/gr_cli.h (1)
arg_u16(64-70)modules/ip6/cli/icmp6.c (4)
sighandler(19-21)
icmp_send(23-138)
ping(140-175)
traceroute(177-206)modules/ip/control/icmp.c (1)
icmp_send(91-104)
modules/ip6/cli/icmp6.c (5)
api/gr_api.h (1)
gr_api_client_send_recv(41-52)cli/gr_cli.h (1)
arg_u16(64-70)modules/ip/cli/icmp.c (2)
icmp_send(31-128)
traceroute(159-180)modules/ip/control/icmp.c (1)
icmp_send(91-104)cli/ecoli.c (1)
arg_ip6(195-197)
modules/ip6/datapath/ip6_loadbalance.c (5)
modules/infra/datapath/gr_mbuf.h (1)
gr_mbuf_is_traced(52-54)modules/infra/datapath/trace.c (1)
gr_mbuf_trace_add(522-549)modules/infra/control/gr_graph.h (1)
rte_node_enqueue_x1(22-25)modules/ip/datapath/ip_loadbalance.c (1)
loadbalance_register(53-55)modules/ip6/datapath/ip6_output.c (1)
ip6_output_register_nexthop_type(41-48)
smoke/ip_loadbalance_frr_test.sh (2)
smoke/_init_frr.sh (3)
create_interface(9-28)
set_ip_address(30-61)
set_ip_route(84-124)smoke/_init.sh (1)
netns_add(93-99)
smoke/config_test.sh (1)
modules/infra/control/gr_nh_control.h (1)
nexthop(22-28)
modules/infra/cli/nexthop.c (3)
cli/gr_cli.h (1)
arg_u32(72-78)api/gr_api.h (1)
gr_api_client_send_recv(41-52)cli/ecoli.c (1)
with_help(15-24)
🪛 Shellcheck (0.11.0)
smoke/ip_loadbalance_test.sh
[warning] 5-5: Quote this to prevent word splitting.
(SC2046)
[warning] 7-7: run_id is referenced but not assigned.
(SC2154)
smoke/ip_loadbalance_frr_test.sh
[warning] 12-12: Quote this to prevent word splitting.
(SC2046)
[warning] 14-14: run_id is referenced but not assigned.
(SC2154)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: build-and-tests (clang-18, none, debugoptimized, ubuntu-24.04, false)
- GitHub Check: build-and-tests (clang-15, none, debugoptimized, ubuntu-22.04, false)
- GitHub Check: build-and-tests (clang-16, none, debugoptimized, ubuntu-24.04, false)
- GitHub Check: build-and-tests (gcc-14, address, debug, ubuntu-24.04, -Dfrr=enabled, true)
- GitHub Check: build-and-tests (gcc-13, none, debugoptimized, ubuntu-24.04, -Dfrr=enabled, false)
- GitHub Check: deb
- GitHub Check: rpm
🔇 Additional comments (19)
smoke/config_test.sh (1)
45-45: LGTM: teardown cleans the groupDeletion aligns with setup and avoids leftovers.
modules/ip/cli/icmp.c (5)
36-36: IPv4 ICMP: ident propagation looks correctPassing ident into req and recv filters is consistent and minimal.
Also applies to: 50-50, 57-57
134-147: Ping: sensible defaults and parsingRandom ident default + optional IDENT parsing is fine. Call site updated accordingly.
Also applies to: 152-152
162-167: Traceroute: ident integrationSame pattern as ping; looks good.
Also applies to: 175-175
189-199: CLI: add IDENT to ping usage/helpUsage/help consistent with parsing (range 1..UINT16_MAX).
206-206: CLI: traceroute IDENT optionSyntax/help updated correctly.
Also applies to: 210-213
modules/ip6/datapath/icmp6_local_send.c (1)
105-109: Good: set fake RSS and flagSetting mbuf->hash.rss and RTE_MBUF_F_RX_RSS_HASH matches the load-balance path.
modules/ip6/cli/icmp6.c (5)
28-28: IPv6 ICMP: ident wired through send/recvAssignment and recv filter updated coherently.
Also applies to: 42-42, 50-50
144-156: Ping6: IDENT default + parsingRandom default, optional IDENT; call site updated. Looks good.
Also applies to: 170-170
180-188: Traceroute6: IDENT integrationMatches ping6 pattern; OK.
Also applies to: 201-201
215-229: CLI: ping6 usage/help updatedIDENT option surfaced properly; ranges align with arg_u16.
236-248: CLI: traceroute6 usage/help updatedConsistent with parsing.
modules/infra/cli/nexthop.c (3)
113-121: Formatter: concise group renderingPrinting id(weight) per member is fine.
138-143: Register group formatterGood integration with formatter registry.
575-575: Registering group formatter: OKFormatter is added alongside others.
modules/infra/api/gr_nexthop.h (1)
35-36: GROUP type exposure looks goodEnum value and type-name mapping are consistent with the rest of the API. LGTM.
Also applies to: 181-183
frr/rt_grout.c (1)
806-808: Group-path handoff is correctDelegating to grout_add_nexthop_group when group members are present is the right split. LGTM.
modules/ip6/datapath/ip6_loadbalance.c (2)
53-55: LGTM!Correctly registers the nexthop group type with ip6_output, following the same pattern as the IPv4 implementation.
57-73: LGTM!Node registration structure and callbacks are correctly configured. The edge mappings to
ip6_outputandip6_lb_no_nexthopare appropriate for load-balance routing.
| len = sizeof(*req) + sizeof(*group) + n_members * sizeof(group->members[0]); | ||
| if ((req = calloc(1, len)) == NULL) { | ||
| errno = ENOMEM; | ||
| goto out; | ||
| } | ||
|  | ||
| req->exist_ok = true; | ||
| req->nh.type = GR_NH_T_GROUP; | ||
| req->nh.origin = GR_NH_ORIGIN_USER; | ||
|  | ||
| if (arg_u32(p, "ID", &req->nh.nh_id) < 0 && errno != ENOENT) | ||
| goto out; | ||
|  | ||
| group = (struct gr_nexthop_info_group *)&req->nh.info; | ||
|  | ||
| while ((n = ec_pnode_find_next(p, n, "MEMBER", false)) != NULL) { | ||
| if (arg_u32(n, "NHID", &group->members[group->n_members].nh_id) < 0) | ||
| goto out; | ||
| if (arg_u32(n, "WEIGHT", &group->members[group->n_members].weight) < 0) { | ||
| if (errno == ENOENT) | ||
| group->members[group->n_members].weight = 1; | ||
| else | ||
| goto out; | ||
| } | ||
| group->n_members++; | ||
| } | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Critical: wrong pointer cast to nh.info; and size calculation needs overflow guard
- Casting with & yields pointer-to-array, not to first byte; results in misaligned writes.
- Guard against size_t overflow when computing len; optionally reject empty groups.
Apply:
- len = sizeof(*req) + sizeof(*group) + n_members * sizeof(group->members[0]);
+ size_t hdr = sizeof(*req) + sizeof(*group);
+ size_t msz = sizeof(((struct gr_nexthop_info_group *)0)->members[0]);
+ if (n_members == 0) { errno = EINVAL; goto out; }
+ if (n_members > (SIZE_MAX - hdr) / msz) { errno = EOVERFLOW; goto out; }
+ len = hdr + n_members * msz;
@@
- group = (struct gr_nexthop_info_group *)&req->nh.info;
+ group = (struct gr_nexthop_info_group *)req->nh.info;🤖 Prompt for AI Agents
In modules/infra/cli/nexthop.c around lines 318 to 343, the code computes len
and casts to a group pointer incorrectly and lacks overflow/bounds checks; fix
by (1) computing max_members = (SIZE_MAX - sizeof(*req) - sizeof(*group)) /
sizeof(group->members[0]) and verify n_members <= max_members (or set
errno=EINVAL and goto out) to prevent size_t overflow and reject excessively
large/empty groups, (2) compute len only after that check, allocate, and on
allocation failure set errno=ENOMEM and goto out as before, and (3) set group =
(struct gr_nexthop_info_group *)req->nh.info (i.e. cast the nh.info buffer
itself, not &req->nh.info) so writes target the first byte of the info buffer,
plus ensure you never increment group->n_members past max_members while parsing
MEMBER entries (set errno=EOVERFLOW/EINVAL and goto out if exceeded).
| grcli nexthop add group id 333 member 42 weight 102 | ||
| grcli nexthop add group id 333 member 45 member 47 | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Combine into a single group-add; second call likely overwrites the first
GR_NH_ADD replaces the group's member list. Issuing two commands risks losing member 42/weight 102. Prefer one atomic add.
Apply:
-grcli nexthop add group id 333 member 42 weight 102
-grcli nexthop add group id 333 member 45 member 47
+grcli nexthop add group id 333 member 42 weight 102 member 45 member 47📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| grcli nexthop add group id 333 member 42 weight 102 | |
| grcli nexthop add group id 333 member 45 member 47 | |
| grcli nexthop add group id 333 member 42 weight 102 member 45 member 47 | 
🤖 Prompt for AI Agents
In smoke/config_test.sh around lines 19 to 20, the script issues two grcli
nexthop add group calls which will cause the second call to replace the group's
member list and potentially drop member 42/weight 102; instead, combine into a
single grcli nexthop add group command that includes all desired members and
preserves the weight for member 42 so the group is created atomically with
members 42 (weight 102), 45, and 47.
0ab2ae9    to
    acef676      
    Compare
  
    There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
commit message for icmp,icmp6: select nexthop based on ident field is off:
icmp(6)_local_send access the nexthop to select the proper gateway.
This can't work with the nexthop group. As we can't use the rss field,
use the "ident" field instead.
Add smoke test for loadbalance.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
That last sentence should be in the next commit.
        
          
                modules/infra/cli/gr_cli_iface.h
              
                Outdated
          
        
      | #pragma once | ||
|  | ||
| #include <gr_infra.h> | ||
| #include <gr_nexthop.h> | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you move the include from modules/infra/cli/nexthop.c to here?
        
          
                modules/infra/cli/nexthop.c
              
                Outdated
          
        
      | if (arg_u32(p, "ID", &req->nh.nh_id) < 0 && errno != ENOENT) | ||
| goto out; | ||
|  | ||
| group = (struct gr_nexthop_info_group *)&req->nh.info; | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Drop the & operator. info is already an array (i.e. a pointer).
        
          
                modules/infra/control/nexthop.c
              
                Outdated
          
        
      | if (info->n_members == 0) { | ||
| rte_free(info->members); | ||
| info->members = NULL; | ||
| } | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be done when freeing the nexthop group?
78f2037    to
    427cc86      
    Compare
  
    There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (4)
frr/rt_grout.c (1)
581-612: Guard len against size_t overflow in grout_add_nexthop_groupcount-based arithmetic can overflow and under-allocate.
Apply:
- size_t len; - - len = sizeof(*req) + sizeof(*group) - + dplane_ctx_get_nhe_nh_grp_count(ctx) * sizeof(group->members[0]); - if ((req = calloc(1, len)) == NULL) - return ZEBRA_DPLANE_REQUEST_FAILURE; + size_t len, count, hdr, msz; + count = dplane_ctx_get_nhe_nh_grp_count(ctx); + hdr = sizeof(*req) + sizeof(struct gr_nexthop_info_group); + msz = sizeof(((struct gr_nexthop_info_group *)0)->members[0]); + if (count > (SIZE_MAX - hdr) / msz) { + return ZEBRA_DPLANE_REQUEST_FAILURE; + } + len = hdr + count * msz; + if ((req = calloc(1, len)) == NULL) + return ZEBRA_DPLANE_REQUEST_FAILURE; @@ - group->n_members = dplane_ctx_get_nhe_nh_grp_count(ctx); + group->n_members = count;modules/infra/control/nexthop.c (1)
488-512: Fix swap-and-shrink loop to avoid skipping elements.The loop increments
iafter swapping the last element into positioni, which skips checking the swapped-in element. If duplicates exist, they won't be removed. This was flagged in a previous review but the fix is not visible in the current code.Apply the previously suggested fix:
- for (uint32_t i = 0; i < info->n_members; i++) { + for (uint32_t i = 0; i < info->n_members; ) { if (info->members[i].nh == nh) { + nexthop_decref(info->members[i].nh); info->members[i].nh = info->members[info->n_members - 1].nh; info->members[i].weight = info->members[info->n_members - 1].weight; info->n_members--; + continue; } + i++; }modules/ip/datapath/icmp_local_send.c (1)
49-54: Guard against NULL group member.
g->members[ident % g->n_members].nhcan be NULL. Dereferencing at line 65 will crash. Also, cast should preserve const-correctness.Apply:
- if (gw->type == GR_NH_T_GROUP) { - struct nexthop_info_group *g = (struct nexthop_info_group *)gw->info; - if (g->n_members == 0) - return errno_set(EHOSTUNREACH); - gw = g->members[ident % g->n_members].nh; - } + if (gw->type == GR_NH_T_GROUP) { + const struct nexthop_info_group *g = (const struct nexthop_info_group *)gw->info; + if (g->n_members == 0) + return errno_set(EHOSTUNREACH); + const struct nexthop *m = g->members[ident % g->n_members].nh; + if (m == NULL) + return errno_set(EHOSTUNREACH); + gw = m; + }modules/ip6/datapath/ip6_loadbalance.c (1)
20-51: Guard against NULL group member and missing RSS.Line 42:
g->members[mbuf->hash.rss % g->n_members].nhcan be NULL. Also, the TODO at line 41 notes RSS hash may not be populated by drivers.Add NULL check and RSS fallback:
if (unlikely(g->n_members == 0)) { edge = NO_NEXTHOP; goto next; } - // TODO: increment xstat on ! mbuf->ol_flags & RTE_MBUF_F_RX_RSS_HASH - d->nh = g->members[mbuf->hash.rss % g->n_members].nh; + uint32_t idx = 0; + if (mbuf->ol_flags & RTE_MBUF_F_RX_RSS_HASH) { + idx = mbuf->hash.rss % g->n_members; + } + // TODO: increment xstat when RSS is missing + const struct nexthop *m = g->members[idx].nh; + if (m == NULL) { + edge = NO_NEXTHOP; + goto next; + } + d->nh = m;
🧹 Nitpick comments (2)
modules/infra/cli/nexthop.c (1)
487-501: Require at least one MEMBER in CLI grammarEnforce non-empty groups at parse-time.
- "group [(id ID)] (member MEMBER)*", + "group [(id ID)] (member MEMBER)+",Based on learnings.
smoke/ip_loadbalance_test.sh (1)
5-9: Set a default run_id to avoid empty/duplicate namesPrevents collisions and accidental bare “0/1/2” interface names when run_id isn’t set.
-. $(dirname $0)/_init.sh +. "$(dirname "$0")/_init.sh" + +: "${run_id:=gr$$}"As per coding guidelines.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
- docs/graph.svgis excluded by- !**/*.svg
📒 Files selected for processing (17)
- frr/rt_grout.c(3 hunks)
- modules/infra/api/gr_nexthop.h(3 hunks)
- modules/infra/cli/nexthop.c(5 hunks)
- modules/infra/control/gr_nh_control.h(1 hunks)
- modules/infra/control/nexthop.c(5 hunks)
- modules/ip/cli/icmp.c(5 hunks)
- modules/ip/datapath/icmp_local_send.c(2 hunks)
- modules/ip/datapath/ip_loadbalance.c(1 hunks)
- modules/ip/datapath/meson.build(1 hunks)
- modules/ip6/cli/icmp6.c(8 hunks)
- modules/ip6/datapath/icmp6_local_send.c(2 hunks)
- modules/ip6/datapath/ip6_loadbalance.c(1 hunks)
- modules/ip6/datapath/meson.build(1 hunks)
- smoke/config_test.sh(2 hunks)
- smoke/ip_loadbalance_frr_test.sh(1 hunks)
- smoke/ip_loadbalance_test.sh(1 hunks)
- subprojects/packagefiles/frr/meson-add-dependency-definition.patch(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (8)
- modules/infra/control/gr_nh_control.h
- modules/infra/api/gr_nexthop.h
- modules/ip6/datapath/icmp6_local_send.c
- modules/ip/datapath/meson.build
- smoke/config_test.sh
- modules/ip/datapath/ip_loadbalance.c
- modules/ip6/datapath/meson.build
- subprojects/packagefiles/frr/meson-add-dependency-definition.patch
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{c,h}
⚙️ CodeRabbit configuration file
**/*.{c,h}: -gr_vec_*()functions cannot fail. No need to check their return value.
gr_vec_free(x)always setsx = NULL. There is no risk of double free.
ec_node_*()functions consume theirec_nodearguments. No leaks on error.
rte_node->ctxis an uint8_t array of size 16, not a pointer.- Don't suggest to replace
assert()with graceful error checking.- We compile with
-std=gnu2x. Unnamed parameters in function definitions are valid.
Files:
- modules/infra/control/nexthop.c
- modules/ip/cli/icmp.c
- modules/ip6/datapath/ip6_loadbalance.c
- modules/ip/datapath/icmp_local_send.c
- frr/rt_grout.c
- modules/ip6/cli/icmp6.c
- modules/infra/cli/nexthop.c
**/*.sh
⚙️ CodeRabbit configuration file
**/*.sh: - Don't bother about unquoted shell variables.
Files:
- smoke/ip_loadbalance_frr_test.sh
- smoke/ip_loadbalance_test.sh
🧬 Code graph analysis (9)
modules/infra/control/nexthop.c (4)
api/gr_errno.h (2)
errno_set(9-12)
errno_set_null(14-17)modules/infra/datapath/main_loop.c (1)
gr_datapath_rcu(300-302)main/event.c (1)
gr_event_register_serializer(34-54)main/module.c (1)
gr_register_module(49-61)
modules/ip/cli/icmp.c (6)
api/gr_api.h (1)
gr_api_client_send_recv(41-52)cli/gr_cli.h (1)
arg_u16(64-70)modules/ip6/cli/icmp6.c (4)
sighandler(19-21)
icmp_send(23-138)
ping(140-175)
traceroute(177-206)cli/interact.c (1)
sighandler(22-22)modules/ip/control/icmp.c (1)
icmp_send(91-104)cli/ecoli.c (1)
arg_ip4(191-193)
smoke/ip_loadbalance_frr_test.sh (2)
smoke/_init_frr.sh (3)
create_interface(9-28)
set_ip_address(30-61)
set_ip_route(84-124)smoke/_init.sh (1)
netns_add(93-99)
modules/ip6/datapath/ip6_loadbalance.c (5)
modules/infra/datapath/gr_mbuf.h (1)
gr_mbuf_is_traced(52-54)modules/infra/datapath/trace.c (1)
gr_mbuf_trace_add(522-549)modules/infra/control/gr_graph.h (1)
rte_node_enqueue_x1(22-25)modules/ip/datapath/ip_loadbalance.c (1)
loadbalance_register(53-55)modules/ip6/datapath/ip6_output.c (1)
ip6_output_register_nexthop_type(41-48)
modules/ip/datapath/icmp_local_send.c (1)
api/gr_errno.h (1)
errno_set(9-12)
frr/rt_grout.c (1)
frr/zebra_dplane_grout.c (1)
grout_client_send_recv(335-372)
modules/ip6/cli/icmp6.c (5)
api/gr_api.h (1)
gr_api_client_send_recv(41-52)cli/gr_cli.h (1)
arg_u16(64-70)modules/ip/cli/icmp.c (2)
icmp_send(31-128)
traceroute(159-180)modules/ip/control/icmp.c (1)
icmp_send(91-104)cli/ecoli.c (1)
arg_ip6(195-197)
modules/infra/cli/nexthop.c (3)
cli/gr_cli.h (1)
arg_u32(72-78)api/gr_api.h (1)
gr_api_client_send_recv(41-52)cli/ecoli.c (1)
with_help(15-24)
smoke/ip_loadbalance_test.sh (1)
smoke/_init.sh (1)
netns_add(93-99)
🪛 Shellcheck (0.11.0)
smoke/ip_loadbalance_frr_test.sh
[warning] 12-12: Quote this to prevent word splitting.
(SC2046)
[warning] 14-14: run_id is referenced but not assigned.
(SC2154)
smoke/ip_loadbalance_test.sh
[warning] 5-5: Quote this to prevent word splitting.
(SC2046)
[warning] 7-7: run_id is referenced but not assigned.
(SC2154)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: build-and-tests (clang-18, none, debugoptimized, ubuntu-24.04, false)
- GitHub Check: build-and-tests (clang-16, none, debugoptimized, ubuntu-24.04, false)
- GitHub Check: build-and-tests (gcc-14, address, debug, ubuntu-24.04, -Dfrr=enabled, true)
- GitHub Check: build-and-tests (clang-15, none, debugoptimized, ubuntu-22.04, false)
- GitHub Check: build-and-tests (gcc-13, none, debugoptimized, ubuntu-24.04, -Dfrr=enabled, false)
- GitHub Check: build-cross-aarch64
- GitHub Check: rpm
- GitHub Check: deb
🔇 Additional comments (16)
modules/ip6/cli/icmp6.c (1)
28-31: ICMPv6 ident propagation LGTMident is threaded consistently through send/recv and CLI; matches IPv4 path. No issues spotted.
Also applies to: 41-51, 140-171, 177-205, 211-249
frr/rt_grout.c (1)
188-206: GR_NH_T_GROUP mapping: ensure nh->type is set or unreachableIn grout_gr_nexthop_to_frr_nexthop(), the GROUP case doesn’t set nh->type. If this code path is reachable (nh_id==0, group inline), type must be set (e.g., NEXTHOP_TYPE_IFINDEX) or rejected earlier.
Would you confirm GROUP never flows through this path with nh_id==0? If it can, I can propose a small patch to set a sane nh->type. As per coding guidelines.
Also applies to: 292-299
smoke/ip_loadbalance_frr_test.sh (1)
12-12: Source common init and set a safe default run_idnetns_add is defined in _init.sh (not in _init_frr.sh in provided snippets). Also, guard against unset run_id.
Apply:
-. $(dirname $0)/_init_frr.sh +. "$(dirname "$0")/_init_frr.sh" +. "$(dirname "$0")/_init.sh" + +: "${run_id:=gr$$}"As per coding guidelines.
Likely an incorrect or invalid review comment.
modules/ip/cli/icmp.c (4)
31-37: LGTM: ident parameter added cleanly.The signature extension is straightforward and correctly propagates the identifier through the function.
130-157: LGTM: ping() ident integration is correct.Default random ident with optional CLI override, proper error handling for the optional parameter.
159-180: LGTM: traceroute() ident handling matches ping() pattern.Consistent implementation with proper error handling.
182-218: LGTM: CLI documentation is clear and complete.Help text properly documents the optional IDENT parameter and its default random behavior.
modules/infra/control/nexthop.c (6)
308-325: LGTM: GROUP type registration follows existing pattern.Proper validation and duplicate detection in place.
327-366: LGTM: nexthop_new correctly validates GROUP type.
825-836: LGTM: group_equal correctly compares count before members.The fix from the previous review (comparing n_members first) is present.
838-844: LGTM: group_free properly releases member references.Cleanup correctly decrefs all members before freeing the array.
846-885: LGTM: group_import_info allocation and refcount handling fixed.The fixes from previous reviews are present:
- Allocation size is correct (line 853)
- Decref uses old array
tmp(line 880)
887-927: LGTM: group_to_api and registration are correct.Output length set correctly on error (line 896), and group operations properly registered.
modules/ip/datapath/icmp_local_send.c (1)
115-118: RSS hash approach noted; defer algorithm work as planned.Setting fake RSS for ECMP is reasonable for this PR. The PR description mentions load balance algorithm improvements will come later.
modules/ip6/datapath/ip6_loadbalance.c (2)
53-55: LGTM: registration follows standard pattern.
57-73: LGTM: node declaration and registration are correct.Standard graph node setup with proper edge definitions and drop path registration.
| size_t len; | ||
|  | ||
| while ((n = ec_pnode_find_next(p, n, "MEMBER", false)) != NULL) { | ||
| n_members++; | ||
| } | ||
| n = NULL; | ||
|  | ||
| len = sizeof(*req) + sizeof(*group) + n_members * sizeof(group->members[0]); | ||
| if ((req = calloc(1, len)) == NULL) { | ||
| errno = ENOMEM; | ||
| goto out; | ||
| } | ||
|  | ||
| req->exist_ok = true; | ||
| req->nh.type = GR_NH_T_GROUP; | ||
| req->nh.origin = GR_NH_ORIGIN_USER; | ||
|  | ||
| if (arg_u32(p, "ID", &req->nh.nh_id) < 0 && errno != ENOENT) | ||
| goto out; | ||
|  | ||
| group = (struct gr_nexthop_info_group *)&req->nh.info; | ||
|  | ||
| while ((n = ec_pnode_find_next(p, n, "MEMBER", false)) != NULL) { | ||
| if (arg_u32(n, "NHID", &group->members[group->n_members].nh_id) < 0) | ||
| goto out; | ||
| if (arg_u32(n, "WEIGHT", &group->members[group->n_members].weight) < 0) { | ||
| if (errno == ENOENT) | ||
| group->members[group->n_members].weight = 1; | ||
| else | ||
| goto out; | ||
| } | ||
| group->n_members++; | ||
| } | ||
|  | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix unsafe allocation math and incorrect cast; also search MEMBERS recursively
- size_t overflow possible in len computation.
- group cast uses &req->nh.info (pointer-to-array), causing misaligned writes.
- Use descend=true in ec_pnode_find_next so nested MEMBER nodes are found.
Apply:
-	uint32_t n_members = 0;
-	size_t len;
+	uint32_t n_members = 0;
+	size_t len, hdr, msz;
@@
-	while ((n = ec_pnode_find_next(p, n, "MEMBER", false)) != NULL) {
+	while ((n = ec_pnode_find_next(p, n, "MEMBER", true)) != NULL) {
 		n_members++;
 	}
 	n = NULL;
 
-	len = sizeof(*req) + sizeof(*group) + n_members * sizeof(group->members[0]);
+	hdr = sizeof(*req) + sizeof(struct gr_nexthop_info_group);
+	msz = sizeof(((struct gr_nexthop_info_group *)0)->members[0]);
+	if (n_members == 0 || n_members > (SIZE_MAX - hdr) / msz) {
+		errno = EINVAL;
+		goto out;
+	}
+	len = hdr + n_members * msz;
@@
-	group = (struct gr_nexthop_info_group *)&req->nh.info;
+	group = (struct gr_nexthop_info_group *)req->nh.info;
@@
-	while ((n = ec_pnode_find_next(p, n, "MEMBER", false)) != NULL) {
+	while ((n = ec_pnode_find_next(p, n, "MEMBER", true)) != NULL) {Based on learnings.
Also applies to: 319-333, 334-344
427cc86    to
    c0f041b      
    Compare
  
    There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (3)
modules/ip/datapath/icmp_local_send.c (1)
49-54: Guard against NULL member and preserve const-correctness.The selected group member's
nhpointer is not checked for NULL before use, which will cause a crash at line 65 or 66. Also, casting away const fromgw->infois unnecessary.Apply this diff:
if (gw->type == GR_NH_T_GROUP) { - struct nexthop_info_group *g = (struct nexthop_info_group *)gw->info; + const struct nexthop_info_group *g = (const struct nexthop_info_group *)gw->info; if (g->n_members == 0) return errno_set(EHOSTUNREACH); - gw = g->members[ident % g->n_members].nh; + const struct nexthop *m = g->members[ident % g->n_members].nh; + if (m == NULL) + return errno_set(EHOSTUNREACH); + gw = m; }modules/infra/cli/nexthop.c (1)
319-323: Guard against size_t overflow in allocation.The length calculation can overflow when
n_membersis large, potentially causing under-allocation and buffer overflow.Apply this diff:
+ size_t hdr = sizeof(*req) + sizeof(*group); + size_t msz = sizeof(group->members[0]); + if (n_members > (SIZE_MAX - hdr) / msz) { + errno = EOVERFLOW; + goto out; + } len = sizeof(*req) + sizeof(*group) + n_members * sizeof(group->members[0]);modules/infra/control/nexthop.c (1)
488-509: Fix duplicate-skip on swap in nh_groups_remove_member (do not decref here)After swap-and-shrink, the loop advances i and skips the swapped-in element, leaving duplicates. Also, this function is called only when nh->ref_count == 0 (final decref), so decref’ing nh here would underflow. Use a while-loop and re-check the swapped-in slot; no decref.
Apply:
static void nh_groups_remove_member(const struct nexthop *nh) { @@ - for (uint32_t i = 0; i < info->n_members; i++) { - if (info->members[i].nh == nh) { - info->members[i].nh = info->members[info->n_members - 1].nh; - info->members[i].weight = info->members[info->n_members - 1].weight; - info->n_members--; - } - } + for (uint32_t i = 0; i < info->n_members; ) { + if (info->members[i].nh == nh) { + // remove by swap-with-last; no decref here (nh is at final decref) + info->members[i] = info->members[info->n_members - 1]; + info->n_members--; + continue; // re-check swapped-in element + } + i++; + }
🧹 Nitpick comments (6)
smoke/ip_loadbalance_frr_test.sh (1)
12-12: Quote command substitution to prevent word splitting.Apply this diff:
-. $(dirname $0)/_init_frr.sh +. "$(dirname "$0")/_init_frr.sh"smoke/ip_loadbalance_test.sh (1)
5-5: Quote command substitution to prevent word splitting.Apply this diff:
-. $(dirname $0)/_init.sh +. "$(dirname "$0")/_init.sh"modules/infra/control/nexthop.c (1)
842-882: Optional validation in group_import_infoConsider rejecting zero-weight members and optionally deduplicating identical nh entries (coalescing weights), which simplifies datapath selection.
Example guard:
for (uint32_t i = 0; i < group->n_members; i++) { struct nexthop *nh = nexthop_lookup_by_id(group->members[i].nh_id); if (nh) { - members[i].nh = nh; - members[i].weight = group->members[i].weight; + if (group->members[i].weight == 0) + { rte_free(members); return errno_set(EINVAL); } + members[i].nh = nh; + members[i].weight = group->members[i].weight; } else { rte_free(members); return errno_set(ENOENT); } }modules/ip6/cli/icmp6.c (3)
180-181: Do the same IDENT non‑zero clamp for traceroute.Keep behavior consistent across ping and traceroute.
- uint16_t ident = random(); + uint16_t ident = (uint16_t)random(); @@ - if ((ret = arg_u16(p, "IDENT", &ident)) < 0 && ret != ENOENT) + if ((ret = arg_u16(p, "IDENT", &ident)) < 0 && ret != ENOENT) return CMD_ERROR; + if (ident == 0) + ident = 1; @@ - ret = icmp_send(c, &req, 0, 255, ident, true); + ret = icmp_send(c, &req, 0, 255, ident, true);Also applies to: 187-189, 201-202
215-216: CLI help nits: capitalize ICMP and clarify delay text.Minor copy polish for consistency with IPv4 CLI and protocol naming.
- "DEST [vrf VRF] [count COUNT] [delay DELAY] [iface IFACE] [ident IDENT]", + "DEST [vrf VRF] [count COUNT] [delay DELAY] [iface IFACE] [ident IDENT]", @@ - with_help("Delay in ms between icmp6 echo.", ec_node_uint("DELAY", 0, 10000, 10)), + with_help("Delay in ms between ICMPv6 echo requests.", ec_node_uint("DELAY", 0, 10000, 10)), @@ - with_help( - "Icmp ident field (default: random).", + with_help( + "ICMP identifier field (default: random).", ec_node_uint("IDENT", 1, UINT16_MAX, 10) )Also applies to: 225-229
236-237: Mirror the help-text tweaks for traceroute.Keep wording consistent across commands.
- "DEST [vrf VRF] [iface IFACE] [ident IDENT]", + "DEST [vrf VRF] [iface IFACE] [ident IDENT]", @@ - with_help( - "Icmp ident field (default: random).", + with_help( + "ICMP identifier field (default: random).", ec_node_uint("IDENT", 1, UINT16_MAX, 10) )Also applies to: 244-248
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
- docs/graph.svgis excluded by- !**/*.svg
📒 Files selected for processing (17)
- frr/rt_grout.c(3 hunks)
- modules/infra/api/gr_nexthop.h(3 hunks)
- modules/infra/cli/nexthop.c(5 hunks)
- modules/infra/control/gr_nh_control.h(1 hunks)
- modules/infra/control/nexthop.c(5 hunks)
- modules/ip/cli/icmp.c(5 hunks)
- modules/ip/datapath/icmp_local_send.c(2 hunks)
- modules/ip/datapath/ip_loadbalance.c(1 hunks)
- modules/ip/datapath/meson.build(1 hunks)
- modules/ip6/cli/icmp6.c(8 hunks)
- modules/ip6/datapath/icmp6_local_send.c(2 hunks)
- modules/ip6/datapath/ip6_loadbalance.c(1 hunks)
- modules/ip6/datapath/meson.build(1 hunks)
- smoke/config_test.sh(2 hunks)
- smoke/ip_loadbalance_frr_test.sh(1 hunks)
- smoke/ip_loadbalance_test.sh(1 hunks)
- subprojects/packagefiles/frr/meson-add-dependency-definition.patch(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (7)
- modules/ip/datapath/meson.build
- frr/rt_grout.c
- modules/ip/cli/icmp.c
- subprojects/packagefiles/frr/meson-add-dependency-definition.patch
- smoke/config_test.sh
- modules/ip6/datapath/icmp6_local_send.c
- modules/ip6/datapath/ip6_loadbalance.c
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{c,h}
⚙️ CodeRabbit configuration file
**/*.{c,h}: -gr_vec_*()functions cannot fail. No need to check their return value.
gr_vec_free(x)always setsx = NULL. There is no risk of double free.
ec_node_*()functions consume theirec_nodearguments. No leaks on error.
rte_node->ctxis an uint8_t array of size 16, not a pointer.- Don't suggest to replace
assert()with graceful error checking.- We compile with
-std=gnu2x. Unnamed parameters in function definitions are valid.
Files:
- modules/infra/api/gr_nexthop.h
- modules/infra/control/gr_nh_control.h
- modules/infra/cli/nexthop.c
- modules/ip6/cli/icmp6.c
- modules/ip/datapath/ip_loadbalance.c
- modules/infra/control/nexthop.c
- modules/ip/datapath/icmp_local_send.c
**/*.sh
⚙️ CodeRabbit configuration file
**/*.sh: - Don't bother about unquoted shell variables.
Files:
- smoke/ip_loadbalance_frr_test.sh
- smoke/ip_loadbalance_test.sh
🧠 Learnings (2)
📚 Learning: 2025-10-02T07:42:42.135Z
Learnt from: rjarry
PR: DPDK/grout#326
File: modules/infra/control/graph.c:360-364
Timestamp: 2025-10-02T07:42:42.135Z
Learning: In the grout codebase, `gr_vec_free()` is a macro defined in `main/gr_vec.h` that automatically sets the vector pointer to NULL after freeing the memory, preventing double-free issues. The macro definition is: `#define gr_vec_free(v) ((v) ? free(__gr_vec_hdr(v)) : (void)0, (v) = NULL)`.
Applied to files:
- modules/infra/control/gr_nh_control.h
📚 Learning: 2025-10-08T21:22:45.922Z
Learnt from: rjarry
PR: DPDK/grout#334
File: modules/infra/control/worker.c:278-281
Timestamp: 2025-10-08T21:22:45.922Z
Learning: In the codebase, `gr_vec_add` is a macro that does not return any value and cannot fail. Do not suggest checking its return value or adding error handling around it.
Applied to files:
- modules/infra/control/gr_nh_control.h
🧬 Code graph analysis (7)
modules/infra/cli/nexthop.c (2)
cli/gr_cli.h (1)
arg_u32(72-78)api/gr_api.h (1)
gr_api_client_send_recv(41-52)
modules/ip6/cli/icmp6.c (4)
api/gr_api.h (1)
gr_api_client_send_recv(41-52)cli/gr_cli.h (1)
arg_u16(64-70)modules/ip/cli/icmp.c (2)
icmp_send(31-128)
traceroute(159-180)cli/ecoli.c (1)
arg_ip6(195-197)
modules/ip/datapath/ip_loadbalance.c (5)
modules/infra/datapath/gr_mbuf.h (1)
gr_mbuf_is_traced(52-54)modules/infra/datapath/trace.c (1)
gr_mbuf_trace_add(522-549)modules/infra/control/gr_graph.h (1)
rte_node_enqueue_x1(22-25)modules/ip6/datapath/ip6_loadbalance.c (1)
loadbalance_register(53-55)modules/ip/datapath/ip_output.c (1)
ip_output_register_nexthop_type(48-55)
modules/infra/control/nexthop.c (3)
api/gr_errno.h (2)
errno_set(9-12)
errno_set_null(14-17)modules/infra/datapath/main_loop.c (1)
gr_datapath_rcu(300-302)main/event.c (1)
gr_event_register_serializer(34-54)
modules/ip/datapath/icmp_local_send.c (1)
api/gr_errno.h (1)
errno_set(9-12)
smoke/ip_loadbalance_frr_test.sh (2)
smoke/_init_frr.sh (3)
create_interface(9-28)
set_ip_address(30-61)
set_ip_route(84-124)smoke/_init.sh (1)
netns_add(93-99)
smoke/ip_loadbalance_test.sh (1)
smoke/_init.sh (1)
netns_add(93-99)
🪛 Shellcheck (0.11.0)
smoke/ip_loadbalance_frr_test.sh
[warning] 12-12: Quote this to prevent word splitting.
(SC2046)
[warning] 14-14: run_id is referenced but not assigned.
(SC2154)
smoke/ip_loadbalance_test.sh
[warning] 5-5: Quote this to prevent word splitting.
(SC2046)
[warning] 7-7: run_id is referenced but not assigned.
(SC2154)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: build-and-tests (clang-16, none, debugoptimized, ubuntu-24.04, false)
- GitHub Check: build-and-tests (gcc-14, address, debug, ubuntu-24.04, -Dfrr=enabled, true)
- GitHub Check: build-and-tests (gcc-13, none, debugoptimized, ubuntu-24.04, -Dfrr=enabled, false)
- GitHub Check: build-and-tests (clang-15, none, debugoptimized, ubuntu-22.04, false)
- GitHub Check: build-and-tests (clang-18, none, debugoptimized, ubuntu-24.04, false)
- GitHub Check: build-cross-aarch64
- GitHub Check: rpm
- GitHub Check: deb
🔇 Additional comments (23)
modules/ip6/datapath/meson.build (1)
13-13: LGTM!Addition of
ip6_loadbalance.ccorrectly integrates the new IPv6 load-balancing node into the build.modules/infra/api/gr_nexthop.h (3)
35-35: LGTM!New enum value
GR_NH_T_GROUPis properly defined for ECMP support.
103-112: LGTM!Group nexthop structures are well-designed with proper flexible array member usage for dynamic member count.
181-182: LGTM!Type name mapping for
GR_NH_T_GROUPis consistent with existing patterns.modules/ip/datapath/icmp_local_send.c (1)
115-118: LGTM!RSS hash setup correctly enables ECMP/bond selection for locally-generated packets. The
RTE_MBUF_F_RX_RSS_HASHflag properly indicates the hash is valid.smoke/ip_loadbalance_frr_test.sh (1)
14-60: LGTM!Test topology correctly exercises FRR-based ECMP by creating multiple paths and validating route switching behavior.
modules/infra/control/gr_nh_control.h (2)
57-60: LGTM!Group member structure is clean and supports future weighted ECMP.
62-65: LGTM!Group type info registration follows the established pattern and correctly defines the internal structure.
modules/infra/cli/nexthop.c (3)
114-125: LGTM!Group formatter correctly displays members with proper uint32_t format specifiers.
484-501: LGTM!CLI command registration for nexthop groups is well-structured with proper validation ranges.
576-576: LGTM!Group formatter registration correctly integrated into module initialization.
modules/ip/datapath/ip_loadbalance.c (2)
53-55: LGTM!Nexthop type registration correctly wires
GR_NH_T_GROUPto the load-balancing node.
57-73: LGTM!Node registration follows standard patterns with proper edge and drop path definitions.
smoke/ip_loadbalance_test.sh (1)
7-56: LGTM!Test comprehensively validates ECMP functionality with both locally-generated and externally-generated traffic, including proper cleanup.
modules/infra/control/nexthop.c (8)
316-325: LGTM: GROUP type registration addedCase added to nexthop_type_ops_register; guards and duplicate detection preserved.
332-343: LGTM: GROUP allowed in nexthop_newType validation updated; no behavior change otherwise.
536-544: LGTM: cleanup order on final decrefRemoving nh from any groups before id unmap and RCU sync is the right sequencing.
821-833: LGTM: group_equal checks size, order, and weightsDeterministic equality; count comparison addresses prior risk.
834-841: LGTM: group_free decrefs members and frees arrayHandles empty/null safely.
883-907: LGTM: group_to_api length and OOM handlingCorrect total length computation; sets *len=0 on malloc failure.
908-913: LGTM: group ops tableEqual/free/import_info/to_api wired as expected.
915-923: LGTM: registration in initSerializer, module, telemetry, and GROUP ops registered; ordering consistent with L3.
modules/ip6/cli/icmp6.c (1)
28-30: IDENT plumbing looks correct and matches IPv4 flow.Passing ident into icmp_send and setting req->ident and reply_req.ident ensures stable selection and reply matching; parity with modules/ip/cli/icmp.c. Good.
Also applies to: 41-46, 50-53
| g = (struct nexthop_info_group *)d->nh->info; | ||
| edge = OUTPUT; | ||
| if (unlikely(g->n_members == 0)) { | ||
| edge = NO_NEXTHOP; | ||
| goto next; | ||
| } | ||
| // TODO: increment xstat on ! mbuf->ol_flags & RTE_MBUF_F_RX_RSS_HASH | ||
| d->nh = g->members[mbuf->hash.rss % g->n_members].nh; | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Guard against NULL member pointer.
The selected group member's nh can be NULL, which will cause crashes in downstream nodes.
Apply this diff:
 		d = ip_output_mbuf_data(mbuf);
-		g = (struct nexthop_info_group *)d->nh->info;
+		const struct nexthop_info_group *g = (const struct nexthop_info_group *)d->nh->info;
 		edge = OUTPUT;
 		if (unlikely(g->n_members == 0)) {
 			edge = NO_NEXTHOP;
 			goto next;
 		}
 		// TODO: increment xstat on ! mbuf->ol_flags & RTE_MBUF_F_RX_RSS_HASH
-		d->nh = g->members[mbuf->hash.rss % g->n_members].nh;
+		const struct nexthop *nh = g->members[mbuf->hash.rss % g->n_members].nh;
+		if (unlikely(nh == NULL)) {
+			edge = NO_NEXTHOP;
+			goto next;
+		}
+		d->nh = nh;🤖 Prompt for AI Agents
In modules/ip/datapath/ip_loadbalance.c around lines 35 to 42, the code assigns
d->nh from g->members[mbuf->hash.rss % g->n_members].nh without checking that
the chosen member or its nh is non-NULL; add a guard after computing the index
to retrieve the member pointer, verify the member and member->nh are non-NULL,
and if NULL set edge = NO_NEXTHOP and goto next (mirror the existing empty-group
handling) instead of assigning d->nh; ensure no dereference occurs before the
check.
| uint16_t ident = random(); | ||
| uint16_t msdelay = 1000; | ||
| const char *str; | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clamp default IDENT to non‑zero to match CLI contract.
random() can produce 0; CLI range excludes 0. Clamp before calling icmp_send.
Apply:
   uint16_t count = UINT16_MAX;
-  uint16_t ident = random();
+  uint16_t ident = (uint16_t)random();
@@
-  if ((ret = arg_u16(p, "IDENT", &ident)) < 0 && ret != ENOENT)
+  if ((ret = arg_u16(p, "IDENT", &ident)) < 0 && ret != ENOENT)
     return CMD_ERROR;
+  if (ident == 0)
+    ident = 1;
@@
-  ret = icmp_send(c, &req, msdelay, count, ident, false);
+  ret = icmp_send(c, &req, msdelay, count, ident, false);Also applies to: 156-157, 170-171
🤖 Prompt for AI Agents
In modules/ip6/cli/icmp6.c around lines 144-146 (and likewise adjust occurrences
at 156-157 and 170-171), the default IDENT is set directly from random() but
random() may return 0 while the CLI contract excludes 0; clamp IDENT to a
non-zero value before using it (e.g., call random(), if result == 0 set ident to
1 or retry until non-zero) and then pass the clamped ident to icmp_send; ensure
similar clamping logic is applied at the other two locations mentioned.
Add nexthop type "GROUP", which is an aggregation of multiple nexthops. In a group, nexthop are added with a relative weight: grout# nexthop add group id 333 members nh 45 weight 3 nh 42 grout# nexthop type group VRF ID ORIGIN IFACE TYPE INFO 0 333 user group id(45/3) id(42/1) Signed-off-by: Christophe Fontaine <[email protected]> Reviewed-by: Robin Jarry <[email protected]>
Setting the ident field in the ICMP request message is useful for debugging and bringing stability in the future tests. Signed-off-by: Christophe Fontaine <[email protected]> Reviewed-by: Robin Jarry <[email protected]>
icmp(6)_local_send access the nexthop to select the proper gateway. This can't work with the nexthop group. As we can't use the rss field, use the "ident" field instead. Signed-off-by: Christophe Fontaine <[email protected]> Reviewed-by: Robin Jarry <[email protected]>
Using the configured nexthop GR_NH_T_GROUP, we select one of the nexthop based on the input RSS. As of now, we consider all weights being equal to "1". A further commit will take into account the route weight. Add smoke test for ip_loadbalance node. Signed-off-by: Christophe Fontaine <[email protected]> Reviewed-by: Robin Jarry <[email protected]>
Using the new nexthop type group, allow multiple nexthop
as a gateway from frr.
The following zebra conf:
!
ip route 192.168.1.20/32 cv0
ip route 192.168.1.20/32 192.168.210.3
ip route 192.168.1.20/32 192.168.210.254
!
translates into these rib nexthops:
ID: 35 (zebra)
     RefCnt: 2
     Uptime: 00:00:07
     VRF: default(No AFI)
     Nexthop Count: 3
     Valid, Installed
     Depends: (20) (24) (30)
           is directly connected, cv0 (vrf default), weight 1
           via 192.168.210.3, cv0 (vrf default), weight 1
           via 192.168.210.254, cv0 (vrf default), weight 1
With grout displays the proper NHs:
grout# nexthop type group
VRF  ID  ORIGIN  IFACE  TYPE   INFO
0    35  user           group  id(20/1) id(24/1) id(30/1)
Signed-off-by: Christophe Fontaine <[email protected]>
Reviewed-by: Robin Jarry <[email protected]>
    
Summary by CodeRabbit
New Features
CLI
Tests